Since the bootstrap method was introduced by Efron in 1979, there have been
many  applications  in  estimating  models,  generating  bootstrap  confidence  intervals, 
and  testing  of  hypotheses  using  bootstrap  based  critical  values. 

The  dissertation's  main  emphasis  is  on  comparing  the  performance  of  two 
bootstrap  methods:  bootstrapping  the  residuals  and  bootstrapping  the  data,  a 
method  suggested  by  Efron.  The  comparison  is  made  with  special  reference  to  the 
logit,  the  probit  and  the  tobit  models  and  also  in  obtaining  the  correct  significance 
levels  for  the  Wald,  likelihood  ratio  and  Lagrange  multiplier  tests.  Several  methods 
of  bootstrapping  the  residuals  have  been  investigated.  An  extension  is  also  made 
on  the  method  of  generating  short  bootstrap  confidence  intervals.  The  general 
conclusion  that  emerges  from  the  several  Monte  Carlo  experiments  conducted  is 
that  the  method  of  bootstrapping  the  residuals  is  to  be  preferred  to  the  method  of 
bootstrapping  the  data. 


CHAPTER  1 
INTRODUCTION 

The  bootstrap  method,  introduced  by  Efron  (1979),  is  a  resampling  method 
whereby  information  in  the  sample  data  is  "recycled"  for  the  purpose  of  inference. 
Resampling  methods  are  not  new.  The  jackknife,  introduced  by  Quenouille  (1956), 
is  one  of  the  resampling  methods  used  to  reduce  bias  and  provide  more  reliable 
standard  errors.  Unlike  the  jackknife,  the  bootstrap  resamples  at  random.  In  other 
words,  while  the  jackknife  systematically  deletes  a  fixed  number  of  observations 
in  order  (without  replacement),  the  bootstrap  randomly  picks  a  fixed  number  of 
observations from the original sample with replacement. It serves not only to
reduce bias and provide more reliable standard errors, but also to give interval
estimators and tests of hypotheses. Though the jackknife is shown in Efron (1979)
to be a linear approximation to the bootstrap, the bootstrap method is more widely
applicable than the jackknife.

The  major  applications  of  the  bootstrap  method  are  point  estimation, 
interval  estimation,  and  tests  of  hypotheses.  These  are  not  three  individual  parts, 
but  an  interwoven  system  in  statistics.  There  is  an  enormous  statistical  literature 
on  bootstrap  confidence  intervals.  By  repeated  sampling,  the  bootstrap  can 
approximate  the  unknown  true  distribution  of  a  statistic  with  the  empirical 
bootstrap  cumulative  distribution.  Instead  of  getting  only  the  point  estimator  of 
a  parameter  as  with  a  regular  estimation  procedure,  we  can  obtain  the  interval 
estimator of the parameter. With the bootstrap method, we do not just have a
point estimator, but also have a distribution of this point estimator and its
confidence  interval.  To  get  a  better  bootstrap  point  estimator,  we  might  also  trim 
the  distribution  by  discarding  observations  that  lie  beyond  a  given  confidence 
interval.  Obviously  the  confidence  interval  must  be  one  with  good  features,  such 
as  one  with  a  relatively  good  coverage  or  short  interval.  The  point  estimator,  called 
the  bootstrap  trimmed  mean,  is  then  calculated  using  only  the  "un-trimmed" 
observations. 

Tests  of  hypotheses  are  another  major  application  of  the  bootstrap  method. 
With the bootstrap approximation of the unknown true distribution of a test
statistic, we may obtain a true significance level that is close to the
nominal level. To improve bootstrap hypothesis testing, Hall (1991)
suggested  that  the  resampling  be  done  in  a  way  that  reflects  the  null  hypothesis, 
even  when  the  true  hypothesis  is  distant  from  the  null.  The  second  suggestion  was 
that  bootstrap  hypothesis  tests  should  employ  methods  that  are  already  known 
to  have  good  features  in  the  closely  related  problem  of  confidence  interval 
construction.  From  these  two  recommendations  we  can  see  that  bootstrap 
hypothesis  testing  is  very  closely  related  to  the  construction  of  a  bootstrap 
confidence  interval. 

There  are  two  ways  to  classify  bootstrap  methods.  The  first  classification  is 
that  we  can  either  bootstrap  residuals  of  a  model  or  bootstrap  the  data  itself.  To 
bootstrap  residuals  of  a  model,  we  may  first  estimate  the  model  and  get  the 
residuals,  then  randomly  draw  residuals  with  replacement  to  form  a  new  set  of 
bootstrap  residuals  of  the  same  sample  size.  Another  way  of  bootstrapping 
residuals  is  to  randomly  generate  a  new  set  of  bootstrap  residuals  from  the 
underlying error distribution. To bootstrap the data, say $\{y_i, x_i\}$, we may randomly
draw one pair from the set at a time. Repeatedly drawing with replacement, we can
form a new bootstrap data set $\{y^*, x^*\}$ of the same sample size. The second
classification  scheme  is  to  divide  bootstrapping  into  three  categories:  the 
parametric, semiparametric, and nonparametric. The nonparametric bootstrap
does not depend on distributional assumptions. If we try to estimate a tobit model using the
parametric bootstrap method, we first estimate the model with the original sample
$(y, x)$ to get an estimate $\hat{\beta}$ of the true parameter $\beta$. Second, for each bootstrap
replication, we generate an error term $u^*$ from the normal distribution with mean zero and
estimated variance. Third, we can get a new $y^*$ from the model using $x$ and $\hat{\beta}$.
Then we obtain, from the $y^*$ and $x$, the bootstrap point estimate $\hat{\beta}^*$. This bootstrap
procedure is called the parametric bootstrap. The semiparametric bootstrap is any
method in between these two methods. For an efficient estimation method with a
correctly  specified  model,  the  best  bootstrap  method  is  the  parametric.  While 
perhaps  not  quite  as  good  as  the  parametric  bootstrap,  the  nonparametric 
bootstrap  method  is  relatively  better  when  we  use  an  efficient,  nonparametric 
estimation  method. 
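
The first classification above (bootstrapping residuals versus bootstrapping the data) can be illustrated with a minimal sketch. It is not the dissertation's own code: it assumes a simple linear regression fitted by least squares rather than a logit, probit, or tobit model, and the helper names `ols`, `bootstrap_residuals`, and `bootstrap_data` are hypothetical.

```python
import random

def ols(x, y):
    """Return (intercept, slope) of a simple least-squares fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def bootstrap_residuals(x, y, B, rng):
    """Resample the fitted residuals with replacement; x stays fixed."""
    a, b = ols(x, y)
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    slopes = []
    for _ in range(B):
        e_star = rng.choices(resid, k=len(x))            # bootstrap residuals
        y_star = [a + b * xi + ei for xi, ei in zip(x, e_star)]
        slopes.append(ols(x, y_star)[1])
    return slopes

def bootstrap_data(x, y, B, rng):
    """Resample (x_i, y_i) pairs with replacement."""
    pairs = list(zip(x, y))
    slopes = []
    while len(slopes) < B:
        xs, ys = zip(*rng.choices(pairs, k=len(pairs)))
        if max(xs) > min(xs):                            # skip degenerate draws with constant x
            slopes.append(ols(xs, ys)[1])
    return slopes

# Illustrative data (assumed): y = 1 + 2x + normal noise
rng = random.Random(0)
x = [i / 10 for i in range(30)]
y = [1.0 + 2.0 * xi + rng.gauss(0, 0.5) for xi in x]
res_slopes = bootstrap_residuals(x, y, 200, rng)
pair_slopes = bootstrap_data(x, y, 200, rng)
```

Both lists hold bootstrap replicates of the slope estimate; their spread gives a bootstrap standard error under each resampling scheme.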

As  Efron  (1990)  and  many  other  authors  pointed  out,  the  computational 
burden  of  the  bootstrap  simulations  for  reliable  bootstrap  confidence  intervals  is 
a  problem  even  with  today's  fast  machines.  Several  resampling  methods  to  reduce 
the  computational  burden  have  been  devised.  One  group  of  methods  tries  to 
reduce  the  required  number  of  bootstrap  replications  through  more  sophisticated 
resampling  schemes  such  as  balanced  sampling,  importance  sampling,  and 
antithetic sampling. All of these methods were originally developed for Monte Carlo
analysis in the 1960s. Efron (1990) proposed a post hoc correction method to reduce
the first-order bias in bootstrap estimates. Another line of research has focused
on analytical approximations of bootstrap estimation, such as saddle point
approximations by Davison and Hinkley (1988) and $BC_a$ confidence intervals by
DiCiccio and Tibshirani (1987).

For large samples, we have well developed asymptotic theory. But for small
samples, it is very difficult to find the approximate cumulative distribution
function of a statistic. However, with the bootstrap method, we can approximate
the exact cumulative distribution function of a statistic for any sample size.

Since  the  bootstrap  is  only  a  technique  used  in  estimation,  the  goodness  of 
the  result  from  a  bootstrap  estimation  depends  not  only  on  the  method  of 
bootstrapping  but  also  on  the  estimation  method.  With  an  efficient  estimation 
method,  we  will  likely  have  an  efficient  bootstrap  estimation  method.  With  a  robust 
estimation  method,  we  will  probably  get  a  robust  bootstrap  estimation  method. 
With  a  sensitive  estimation  method,  such  as  the  maximum  likelihood  estimation 
method  for  the  tobit  model,  the  bootstrap  based  on  this  estimation  method  would 
still be sensitive. While this may seem a disadvantage of the bootstrap method,
it  is  not  a  major  drawback,  because  we  can  easily  change  to  another  estimation 
method  to  improve  the  result.  Therefore  with  bootstrapping,  to  obtain  satisfactory 
results,  we  need  to  apply  not  only  the  best  estimation  method,  but  also  the 
bootstrapping  method  that  works  the  best  with  the  selected  method  of  estimation. 

There are two other disadvantages of the bootstrap method. First, since
bootstrapping is very computer intensive, it may take too much computer time,
especially for getting bootstrap confidence intervals. Second, since all the
bootstraps  depend  on  the  original  sample,  the  reliability  of  the  original  sample 
becomes  very  important. 

Since  the  first  paper  by  Efron  (1981)  on  bootstrapping  censored  data,  there 
have  been  few  papers  applying  bootstrap  methods  in  limited  dependent  variable 
models,  that  is,  regression  models  for  which  the  range  of  the  dependent  variable 
is restricted to some subset of the real line. Efron (1981) proposed a
nonparametric bootstrap procedure for use with censored data where the
estimation uses the nonparametric method of Kaplan and Meier (1958) and concluded
through simulation that the procedure performs reasonably well in finite samples.

Teebagy  and  Chatterjee  (1989)  applied  Efron's  nonparametric  bootstrap 
method  with  MLE  to  the  logistic  regression  model.  They  got  satisfactory  results  in 
their Monte Carlo study. Manski and Thompson (1986) studied the use of the bootstrap
in  the  maximum  score  estimation  of  binary  response  variable  models.  Using  the 
same  bootstrap  procedure  as  Teebagy  and  Chatterjee  (1989),  they  found  that  the 
bootstrap  standard  errors  are  very  close  to  the  true  ones.  However,  Adkins  (1990) 
applied  one  of  the  parametric  bootstrap  methods  with  MLE  to  the  probit  model 
and  got  different  results  from  those  of  Teebagy-Chatterjee  in  that  the  bootstrap 
method  tends  to  be  quite  unreliable  in  small  samples.  The  conflict  between  these 
papers  leads  us  to  further  investigate  why  different  bootstrap  methods  provide 
conflicting  results  for  similar  models  and  what  the  effect  will  be  from 
bootstrapping  data  and  bootstrapping  residuals.  Indeed,  this  is  the  major 
motivation  of  this  dissertation. 

Flood  (1985)  applied  the  augmented  bootstrap  method  to  the  tobit  model 
and  found,  through  simulation,  that  the  augmented  bootstrap  gives  standard 
errors  that  are  close  to  the  true  values. 

These  three  studies  of  Teebagy-Chatterjee,  Adkins,  and  Flood  concentrate 
on  estimating  standard  errors  of  estimated  parameters  of  the  logit,  probit,  and 
tobit models. The standard error is then used to form confidence intervals, which
bootstrap methods can already provide directly. The major advantage of the
bootstrap method is that it forms reliable confidence intervals, but the standard interval
generated from bootstrap standard errors does not perform as well as the
percentile interval, which itself has received a lot of criticism. On the other
hand, Jeong and Maddala (1992) argued that only comparing the bootstrap
standard error to the asymptotic standard error is not appropriate because even
if the two of them agree, there can be a large difference in the corresponding
confidence intervals if the bootstrap distribution is sufficiently skewed.

The  Wald,  likelihood  ratio,  and  Lagrange  multiplier  tests  are  the  most 
commonly  used  hypothesis  tests  in  limited  dependent  variable  models.  There  are 
two  major  problems  with  these  tests.  The  first  one  is  the  conflict  between  tests; 
one  test  may  reject  the  null  hypothesis,  but  another  may  fail  to  reject  the  null.  The 
second  problem  is  that  the  true  significance  level  is  frequently  larger  than  the 
nominal  level.  This  means  that  the  tests  over-reject  the  null  hypothesis.  These  two 
problems  arise  from  using  the  asymptotic  chi-square  critical  regions  as  an 
approximation.  If  we  can  find  approximately  true  distributions  of  those  test 
statistics,  which  is  the  advantage  of  the  bootstrap  method,  then  the  problems 
might  be  eliminated. 

Because  of  the  importance  of  bootstrap  confidence  intervals,  the  accuracy 
of  bootstrap  hypothesis  testing,  and  the  goodness  of  bootstrap  trimmed  estimates 
in  limited  dependent  variable  models,  it  is  hoped  that  this  work  will  motivate 
practicing  econometricians  to  consider  the  bootstrap  methods  in  their  analyses. 
It  is  also  hoped  that  this  dissertation  will  provide  an  idea  about  how  to  choose  a 
suitable  bootstrap  method  for  making  inferences  in  limited  dependent  variable 
models.  Finally,  this  dissertation  provides  comparisons  between  the  parametric 
bootstrap  method,  the  semiparametric  bootstrap  methods,  and  the  nonparametric 
bootstrap  method. 

In  this  dissertation,  an  extension  is  made  to  the  method  of  generating  the 
short bootstrap confidence interval. Also, methods to create a real bootstrap
confidence interval and a method of finding the bootstrap trimmed mean are
proposed. Through the comparisons between bootstrapping data and
bootstrapping residuals, as applied to estimate the logit, probit, and tobit models,
some  guidance  is  provided  on  the  relative  merits  of  the  two  approaches.  As  an 
application  of  bootstrap  hypothesis  testing,  approximations  of  the  exact 
distributions  of  the  Wald,  likelihood  ratio,  and  Lagrange  multiplier  tests  are 
provided  through  bootstrapping. 

The  structure  of  the  dissertation  is  as  follows:  In  chapter  2,  a  quasi-pivotal 
method  for  generating  bootstrap  confidence  intervals  is  discussed  and  its 
application  to  the  method  of  generating  bootstrap  confidence  intervals  is 
presented.  With  bootstrap  confidence  intervals,  we  suggest  a  new  method  to  find 
the  bootstrap  trimmed  mean. 

In  chapter  3,  a  comparison  between  the  parametric  bootstrapping  residuals 
method  and  the  nonparametric  bootstrapping  data  method  is  studied  by  Monte 
Carlo  experiments  for  correctly  specified  logit  and  probit  models. 

In  chapter  4,  an  investigation  is  conducted  through  Monte  Carlo 
experiments  to  see  the  differences  between  the  parametric  bootstrap  method,  the 
semiparametric  bootstrap  method,  and  the  nonparametric  bootstrap  method 
under correctly specified tobit models.

In  chapter  5,  an  application  of  bootstrap  methods  to  approximate  the  exact 
distribution  of  the  Wald,  likelihood  ratio,  and  Lagrange  multiplier  test  statistics 
for  the  logit,  probit,  and  tobit  models  is  discussed  in  detail,  once  again  using 
Monte  Carlo  experiments.  At  the  same  time,  the  comparison  between  the 
parametric  bootstrapping  residuals  method  and  nonparametric  bootstrapping  data 
method  is  studied. 

The  final  chapter  presents  the  summary  and  conclusions. 


CHAPTER  2 

A  QUASI-PIVOTAL  METHOD  FOR  GENERATING 
BOOTSTRAP  CONFIDENCE  INTERVALS 


2.1    Introduction 

Instead  of  getting  a  point  estimator,  the  bootstrap  method  can  be  used  to 
give  us  an  interval  estimator.  In  addition  to  this,  we  can  also  get  an  empirical 
cumulative  distribution  function  of  the  estimator.  So  by  bootstrapping,  we  can 
gain  much  more  information  beyond  the  point  estimator  and  its  standard 
deviation.  Therefore  a  major  goal  in  refining  bootstrap  methods  is  to  generate 
better  bootstrap  confidence  intervals.  The  percentile  method,  which  takes  5%  off 
of  each  tail  when  we  try  to  form  a  90%  confidence  interval,  introduced  and 
developed  by  Efron  (1981,  1982,  1985,  1987),  gives  bootstrap  confidence  intervals 
for  parameters  of  interest.  Beran's  B  method  (1987)  gives  more  accurate  bootstrap 
confidence  sets  but  takes  too  much  computing  time,  requiring  1000x1000 
bootstraps  for  one  confidence  interval.  A  more  convenient  method,  which 
automatically  corrects  for  bias,  is  the  bootstrap-t. 
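
As an illustration of the percentile method described above, the following is a minimal sketch, assuming the statistic of interest is the sample mean and the data are a small skewed sample. The function name `percentile_interval` and the simulated data are assumptions made for the example, not part of the original study.

```python
import random

def percentile_interval(data, stat, B=1000, alpha=0.05, seed=0):
    """Percentile interval at level 1 - 2*alpha from B bootstrap replicates."""
    rng = random.Random(seed)
    n = len(data)
    reps = sorted(stat(rng.choices(data, k=n)) for _ in range(B))
    k = round(alpha * B)           # number of replicates cut from each tail
    return reps[k], reps[B - k - 1]

def mean(v):
    return sum(v) / len(v)

# Illustrative skewed sample (assumed): exponential with mean 2, n = 20
rng = random.Random(1)
sample = [rng.expovariate(0.5) for _ in range(20)]
lo, hi = percentile_interval(sample, mean)   # 90% interval: 5% off each tail
```

With `alpha=0.05` the interval drops the lowest and highest 5% of the sorted bootstrap replicates, which is the "5% off of each tail" rule for a 90% interval.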

Hall (1992) has applied the bootstrap method to generate a short confidence
interval. He provided the asymptotic theory and concluded that the accuracy of the
coverage for the short bootstrap confidence interval is $O(n^{-2})$ and the accuracy of
its length is $O_p(n^{-7/2})$. In this chapter, we extend the method of generating the short
bootstrap confidence interval, called the quasi-pivotal method, to be a correction
of the percentile method, Beran's B, the bootstrap-t, and the real bootstrap
confidence interval, which will be presented in section 2.3. We present Monte
Carlo  experiments  showing  that  the  performance  of  the  quasi-pivotal  method  is 
better  than  that  of  the  bootstrap  confidence  intervals  generated  by  the  other 
methods.  The  idea  behind  the  quasi-pivotal  method  comes  directly  from  the 
original  idea  of  the  pivotal  quantity  method,  which  minimizes  the  length  of  the 
bootstrap  confidence  interval  given  its  confidence  level. 

Leger  and  Romano  (1990)  consider  the  problem  of  using  the  bootstrap  to 
adaptively  choose  the  trimming  proportion  in  an  adaptive  trimmed  mean.  We 
suggest  a  new  method  to  find  a  bootstrap  trimmed  mean  by  trimming  the  sum  of 
squares  of  errors  of  a  regression  model. 

The  pivotal  and  the  quasi-pivotal  methods  will  be  presented  in  section  2.2. 
A  real  bootstrap  confidence  interval  for  a  parameter  will  be  given  in  section  2.3, 
and  comparisons  among  the  different  confidence  intervals,  which  are  generated 
by  different  bootstrap  methods,  are  discussed  in  section  2.4.  Finally,  we  will 
compare  the  methods  of  generating  bootstrap  confidence  intervals  to  see  the 
differences  between  bootstrapping  data  and  bootstrapping  residuals  in  section  2.5. 

2.2  The  Quasi-Pivotal  Method 

Let $Q(X,\theta)$ be some function of the random variable $X$ and the parameter $\theta$
such that the distribution of $Q$ does not depend on $\theta$. Then $Q(X,\theta)$ is called a
pivotal quantity. To construct a confidence interval for $\tau(\theta)$ at level $1-2\alpha$ using the
pivotal quantity method, we need to find a pair of numbers $q_1$ and $q_2$ such that

$$P\{q_1 < Q(X,\theta) < q_2\} = 1-2\alpha \tag{2.1}$$

If the quantity $Q(X,\theta)$ can be pivoted in the sense that $q_1 < Q(x,\theta) < q_2$ if and only if
$T_1(x) < \tau(\theta) < T_2(x)$, then $(T_1(X), T_2(X))$ is a $100(1-2\alpha)\%$ confidence interval for $\tau(\theta)$.
Here, we have infinitely many choices of $q_1$ and $q_2$, but we should select the pair
of $q_1$ and $q_2$ that will make $T_1(X)$ and $T_2(X)$ close together in some sense. For
instance, if $T_2(X)-T_1(X)$, which is the length of the confidence interval, is not
random, then we might select the pair of $q_1$ and $q_2$ that minimizes $T_2(X)-T_1(X)$.
This is what we use in our Monte Carlo experiments. Alternatively, if $T_2(X)-T_1(X)$
is random, then we might select the pair of $q_1$ and $q_2$ that minimizes the average
length of the interval. If $Q(X,\theta)$ has a symmetric distribution, like the standard
normal distribution, then $q_2=-q_1$, and the resulting interval is the same as Efron's
percentile interval.

Consider the random variable $X$ having cumulative distribution function $H$
with mean $\beta$. Let

$$T = \frac{\hat{\beta}-\beta}{SE(\hat{\beta})} = T(Z,\beta) \tag{2.2}$$

which is assumed to be distributed with cumulative distribution function $F$ and
is a function of the data and an unknown parameter, where $\hat{\beta}$ is an estimator of $\beta$
and $Z=\{X_1, X_2, \ldots, X_n\}$. Let $T(Z^*,\hat{\beta})$ be an estimator of $T(Z,\beta)$. Then, by using the
bootstrap method, the empirical cumulative distribution function $G$ of $T(Z^*,\hat{\beta})$ is
an estimate of the cumulative distribution function $F$ of $T(Z,\beta)$, and $T(Z^*,\hat{\beta})$ is
actually a statistic. Therefore, to give a short bootstrap confidence interval, we
suggest the following method, which we call a quasi-pivotal method. This
method could also be used as a "correction method" for the percentile method,
Beran's B, the bootstrap-t, and the real bootstrap confidence interval wherever the
percentile method is applied.

To get a bootstrap confidence interval of $T(Z,\beta)$ at level $1-2\alpha$, assume that
the statistic $T(Z^*,\hat{\beta})$ has a cumulative distribution function $G$, i.e.

$$P\{q_1 < T(Z^*,\hat{\beta}) < q_2\} = 1-2\alpha \tag{2.3}$$

We suggest finding $q_1$ and $q_2$ by minimizing $(q_2-q_1)$ subject to $G(q_2)-G(q_1)=1-2\alpha$ and
$q_2>q_1$. The detailed procedure is as follows:

Step 1: Draw a bootstrap sample $z^{*(b)}$ from $z$ of size $n$ with replacement.

Step 2: Use the bootstrap sample $z^{*(b)}$ to get $\hat{\beta}^{*(b)}$, the standard error of $\hat{\beta}^{*(b)}$,
and $T(z^*,\hat{\beta})$.

Step 3: Repeat steps 1 to 2 $B$ times to get an empirical cumulative
distribution function $G$ of the statistic $T(Z^*,\hat{\beta})$ for sufficiently large $B$.

Step 4: Given $q_1$, minimize $q_2$ subject to

$$G(q_2)-G(q_1) \ge 1-2\alpha$$

to get $q_2$ as a function of $q_1$.

Step 5: Minimize $q_2(q_1)-q_1$ with respect to $q_1$ to get $q_1$, then $q_2$.

Step 6: The short bootstrap confidence interval of $T(Z,\beta)$ is $[q_1, q_2]$.

Step 7: Report the short bootstrap confidence interval of $\beta$ at level $1-2\alpha$
to be

$$\left(\hat{\beta}-q_2\cdot SE(\hat{\beta}),\ \hat{\beta}-q_1\cdot SE(\hat{\beta})\right) \tag{2.4}$$


Note that if $T(z,\beta) = f(\hat{\beta})$, where $f$ is a linear increasing function, as in the case of the
bootstrap-t in equation (2.2), then the length of the short bootstrap confidence
interval of $\beta$ is also minimized.
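
The seven steps above can be sketched for the simplest case, where the parameter is a population mean and the estimator is the sample mean. This is only an illustrative sketch under those assumptions; the function names (`t_stat_and_se`, `shortest_window`, `quasi_pivotal_interval`) are hypothetical, and the empirical distribution $G$ is handled by sliding a window over the sorted bootstrap replicates.

```python
import math
import random

def t_stat_and_se(sample, center):
    """Return (t statistic, mean, standard error of the mean) for a sample."""
    n = len(sample)
    m = sum(sample) / n
    var = sum((v - m) ** 2 for v in sample) / (n - 1)
    se = math.sqrt(var / n)
    return (m - center) / se, m, se

def shortest_window(sorted_t, alpha):
    """Steps 4-5: shortest (q1, q2) holding at least 1 - 2*alpha of the replicates."""
    B = len(sorted_t)
    k = math.ceil((1 - 2 * alpha) * B - 1e-9)    # replicates the window must contain
    best = None
    for i in range(B - k + 1):                   # each i fixes q1; q2 is then minimal
        q1, q2 = sorted_t[i], sorted_t[i + k - 1]
        if best is None or q2 - q1 < best[1] - best[0]:
            best = (q1, q2)
    return best

def quasi_pivotal_interval(data, B=1000, alpha=0.05, seed=0):
    rng = random.Random(seed)
    n = len(data)
    _, beta_hat, se_hat = t_stat_and_se(data, 0.0)
    t_reps = []
    while len(t_reps) < B:                       # steps 1-3
        z_star = rng.choices(data, k=n)
        if max(z_star) == min(z_star):
            continue                             # zero standard error, redraw
        t_reps.append(t_stat_and_se(z_star, beta_hat)[0])
    t_reps.sort()
    q1, q2 = shortest_window(t_reps, alpha)      # steps 4-6
    k = round(alpha * B)
    p1, p2 = t_reps[k], t_reps[B - k - 1]        # equal-tailed cut, for comparison only
    short = (beta_hat - q2 * se_hat, beta_hat - q1 * se_hat)        # step 7, eq. (2.4)
    equal_tailed = (beta_hat - p2 * se_hat, beta_hat - p1 * se_hat)
    return short, equal_tailed

# Illustrative skewed sample (assumed): exponential with mean 2, n = 20
rng = random.Random(2)
sample = [rng.expovariate(0.5) for _ in range(20)]
short, equal_tailed = quasi_pivotal_interval(sample)
```

Because the equal-tailed cut is one of the candidate windows, the quasi-pivotal interval in this sketch can never be longer than the equal-tailed bootstrap-t interval built from the same replicates.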

Actually, the quasi-pivotal method always gives the best confidence
intervals, where by best we mean closest to the exact confidence
interval. The percentile method finds $G^{-1}(\alpha)$ and $G^{-1}(1-\alpha)$ by cutting $100\alpha$ percent from
each tail. Because the empirical cumulative bootstrap distribution is usually not
symmetric, the percentile method does not perform well. Efron's bias-corrected
percentile method (known as the BC method) corrects the bias of the percentile
method, and Efron's accelerated bias-corrected percentile ($BC_a$) method (Efron
1987) corrects its skewness.

We could show that, under the assumption of a single peak for the density
function of the empirical cumulative distribution function, the quasi-pivotal
method contains the central densest part of the empirical cumulative bootstrap
distribution. This follows from the first order condition, as will be shown below.
To construct a confidence interval at level $1-2\alpha$, we want to minimize $q_2-q_1$ subject
to

$$G(q_2)-G(q_1) = 1-2\alpha \tag{2.5}$$

Let $G$ and $g$ be the cumulative distribution function and its density function (i.e.
$g$ is the derivative function of the cumulative distribution function $G$). Let $L$ be the
Lagrange function

$$L = q_2-q_1+\lambda\{1-2\alpha-[G(q_2)-G(q_1)]\}$$

then

$$\frac{\partial L}{\partial q_1} = -1+\lambda g(q_1) = 0$$

$$\frac{\partial L}{\partial q_2} = 1-\lambda g(q_2) = 0$$

thus

$$g(q_1) = g(q_2) \tag{2.6}$$

which means that the interval selected by the quasi-pivotal method contains
$100(1-2\alpha)\%$ of the central densest part of the distribution $G$.

Many methods partially apply the percentile method to generate a bootstrap
confidence interval, for instance, the bootstrap-t, Beran's B and the real bootstrap
confidence interval (which will be presented in the next section). Both the quasi-pivotal
method and the percentile method directly find two end points of a confidence
interval. As we have shown above, the quasi-pivotal method is the best one among
those methods of generating bootstrap confidence intervals.
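
The first order condition (2.6) can be checked numerically on a known unimodal distribution. The sketch below is an illustration only: it assumes a Gamma distribution with shape 2 and rate 1 in place of the empirical bootstrap distribution, and it finds the shortest 90% interval by a grid search over $q_1$ with bisection for $q_2$.

```python
import math

def G(q):
    """Cdf of a Gamma(shape=2, rate=1) variable (assumed example distribution)."""
    return 1.0 - (1.0 + q) * math.exp(-q)

def g(q):
    """Density of the same distribution; unimodal with its peak at q = 1."""
    return q * math.exp(-q)

def upper_end(q1, level):
    """Solve G(q2) - G(q1) = level for q2 by bisection."""
    lo, hi = q1, 60.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if G(mid) - G(q1) < level:
            lo = mid
        else:
            hi = mid
    return hi

level = 0.90                                   # 1 - 2*alpha with alpha = 0.05
candidates = [i * 1e-3 for i in range(530)]    # feasible q1 values: G(q1) < 0.10
length, q1 = min((upper_end(c, level) - c, c) for c in candidates)
q2 = q1 + length
```

At the minimizing pair the two densities `g(q1)` and `g(q2)` agree up to the grid resolution, which is what condition (2.6) predicts for the shortest interval.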

The  quasi-pivotal  method  is  also  our  suggested  correction  method.  To  take 
one  example,  we  can  correct  the  bootstrap-t  by  using  the  quasi-pivotal  method 
instead  of  using  the  percentile  method  when  generating  a  bootstrap  confidence 
interval. To get a bootstrap confidence interval of $T(Z,\beta)$ at level $1-2\alpha$ by using the
quasi-pivotal method, the short bootstrap confidence interval is given as $(q_1, q_2)$,
where $q_1$ and $q_2$ are found by following steps 1 to 5 described earlier. When
using the percentile method, the bootstrap confidence interval is given as
$(G^{-1}(\alpha), G^{-1}(1-\alpha))$, where $G$ is the empirical cumulative bootstrap distribution
function. Unless the empirical cumulative distribution function is symmetric, the
corrected  (by  the  quasi-pivotal  method)  bootstrap-t  method  is  always  better  than 
the  bootstrap-t  method  because  it  automatically  corrects  for  skewness  whereas  the 
bootstrap-t  does  not.  In  the  case  of  symmetry,  these  two  methods  are  equivalent, 
because  of  the  equivalency  of  the  quasi-pivotal  method  and  the  percentile  method, 
and  so  no  correction  is  needed. 

In  this  way,  Beran's  B  method,  the  real  bootstrap  confidence  interval  and 
even  the  percentile  method  itself  could  be  corrected  by  the  quasi-pivotal  method 
as long as the true distribution of $T(Z,\beta)$ is asymmetric. In practice, especially for
small samples, bootstrap distributions are skewed. Therefore the quasi-pivotal
method should be used either directly or as a correction method.

2.3  Real Bootstrap Confidence Intervals

In inference theory, if we build a confidence interval, say $(a,b)$, for the true
parameter $\beta$ at level $1-\alpha$, we interpret it as meaning that with repeated sampling,
$100(1-\alpha)\%$ of our confidence intervals would contain the true parameter $\beta$. It
would be very costly, and sometimes impossible, to do the repeated sampling.
Therefore analysts usually get a confidence interval $(a,b)$ and say that with
$100(1-\alpha)\%$ confidence, the true parameter $\beta$ falls in this interval. But things are
different if we bootstrap, since we are actually doing repeated sampling (or
resampling). So the bootstrap method may give us a confidence interval that is closer
to the exact (or true) one than we usually have.

It can be argued that the percentile interval given by Efron (1981) is not
really a confidence interval, since it only gives the central 90% of the empirical
cumulative distribution function of bootstrap estimates of the true parameter $\beta$,
which also makes it hard to draw reasonable inferences about $\beta$ based on just the
bootstrap distribution.

To have a more meaningful confidence interval, let us consider the following
bootstrap procedure:

Step 1: From a bootstrap sample $x^{*(b)}$, we can estimate the quantity
$T(X^{*(b)},\hat{\beta})$, which is

$$T(X^{*(b)},\hat{\beta}) = \frac{\hat{\beta}^{*(b)}-\hat{\beta}}{SE(\hat{\beta}^{*(b)})} \tag{2.7}$$

Step 2: From bootstrapping, we can obtain the empirical cumulative
distribution function, called $G$, of $T(X^{*(b)},\hat{\beta})$. Then for the b-th
bootstrap sample, the real confidence interval for $\beta$ at level $1-2\alpha$ is

$$\left(\hat{\beta}^{*(b)}-G^{-1}(1-\alpha)\cdot SE(\hat{\beta}^{*(b)}),\ \hat{\beta}^{*(b)}-G^{-1}(\alpha)\cdot SE(\hat{\beta}^{*(b)})\right) \tag{2.8}$$

Step 3: Repeat steps 1 to 2 $B$ times. Then the averaged bootstrap confidence
interval, which is called the real bootstrap confidence interval, at
level $1-2\alpha$ is

$$\left(\hat{\beta}^{*}-G^{-1}(1-\alpha)\,\frac{1}{B}\sum_{b=1}^{B} SE(\hat{\beta}^{*(b)}),\ \hat{\beta}^{*}-G^{-1}(\alpha)\,\frac{1}{B}\sum_{b=1}^{B} SE(\hat{\beta}^{*(b)})\right) \tag{2.9}$$

The corrected (by the quasi-pivotal method) real bootstrap
confidence interval at level $1-2\alpha$ would be

$$\left(\hat{\beta}^{*}-\frac{q_2}{B}\sum_{b=1}^{B} SE(\hat{\beta}^{*(b)}),\ \hat{\beta}^{*}-\frac{q_1}{B}\sum_{b=1}^{B} SE(\hat{\beta}^{*(b)})\right) \tag{2.10}$$

The method we describe above, with the advantage of the bootstrap method,
gives us a real confidence interval¹ for the true parameter $\beta$.
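
The three-step procedure leading to the averaged interval can be sketched for the sample mean as follows. This is a hedged illustration, not the author's implementation: it assumes that the centre of the averaged interval is the average of the bootstrap estimates over the $B$ replications, and the names `mean_and_se` and `real_bootstrap_interval` are hypothetical.

```python
import math
import random

def mean_and_se(v):
    """Sample mean and its estimated standard error."""
    n = len(v)
    m = sum(v) / n
    var = sum((x - m) ** 2 for x in v) / (n - 1)
    return m, math.sqrt(var / n)

def real_bootstrap_interval(data, B=1000, alpha=0.05, seed=0):
    rng = random.Random(seed)
    n = len(data)
    beta_hat, _ = mean_and_se(data)
    betas, ses, ts = [], [], []
    while len(ts) < B:
        x_star = rng.choices(data, k=n)                # bootstrap sample
        b_star, se_star = mean_and_se(x_star)
        if se_star == 0.0:
            continue                                   # degenerate draw, redraw
        betas.append(b_star)
        ses.append(se_star)
        ts.append((b_star - beta_hat) / se_star)       # studentized quantity, step 1
    ts.sort()
    k = round(alpha * B)
    g_lo, g_hi = ts[k], ts[B - k - 1]                  # empirical quantiles at alpha, 1 - alpha
    beta_bar = sum(betas) / B                          # assumed meaning of the interval centre
    se_bar = sum(ses) / B                              # averaged bootstrap standard error
    return beta_bar - g_hi * se_bar, beta_bar - g_lo * se_bar   # averaged interval, step 3

# Illustrative skewed sample (assumed): exponential with mean 2, n = 20
rng = random.Random(3)
sample = [rng.expovariate(0.5) for _ in range(20)]
lo, hi = real_bootstrap_interval(sample)
```

Replacing `g_lo` and `g_hi` by the quasi-pivotal pair $q_1$ and $q_2$ would give the corrected version of the same interval.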

2.4  Comparisons of the Confidence Intervals

Consider the random variables $X_1, X_2, \ldots, X_n$ generated from a
cumulative distribution function $F$ with mean $\beta$. We are interested in getting a
confidence interval for $\beta$ by using bootstrap methods. Of course, $\hat{\beta}$, the
estimator of $\beta$, would be $\bar{x}$. We can use this simple model to compare the
methods of generating bootstrap confidence intervals.

To detect the bias and skewness of the different bootstrap confidence
intervals, we need to generate data sets from skewed distributions with different
skewness. We generate a random variable $X$ to form four data sets: from the
Gamma distribution, with $P(\bar{X}<\beta)=0.560$ and $\bar{X}\sim\Gamma(r=5,\lambda=20)$; from the Chi-Square
distribution with 1 degree of freedom, with $P(\bar{X}<\beta)=0.542$ and $\bar{X}\sim\Gamma(r=10,\lambda=10)$; from
the exponential distribution with mean 2, with $P(\bar{X}<\beta)=0.530$ and $\bar{X}\sim\Gamma(r=20,\lambda=10)$;
and from the standard normal distribution, with $P(\bar{X}<\beta)=0.500$ and
$\bar{X}\sim N(0, 0.05)$. For the Gamma distribution, the skewness depends inversely on
the parameter $r$ (figure 2.1). These distributions have different shapes; we want to
examine if the quasi-pivotal method always gives the confidence interval closest
to the true one.

¹ The interval is "real" in the sense that it means the usual confidence interval
instead of the bootstrap confidence interval.

Let $(Q_L, Q_R)$ be a confidence interval. We define two criteria

$$RL/LL = \frac{Q_R-\bar{X}}{\bar{X}-Q_L} \tag{2.11}$$

and

$$RC/LC = \frac{P(\bar{X}\le\beta\le Q_R)}{P(Q_L\le\beta\le\bar{X})} \tag{2.12}$$

With  respect  to  the  sample  mean,  RL/LL  is  the  ratio  of  the  right  length  to  the  left 
length.  The  right  (left)  length  means  the  distance  between  the  sample  mean  and 
the  right  (left)  end  of  the  bootstrap  confidence  interval.  This  criterion  reflects  the 
shape  due  to  the  bias  of  the  empirical  cumulative  distribution.  Thus,  if  this 
quantity  is  very  close  to  the  exact  value  then  the  method  corrects  the  bias 
satisfactorily.  Efron  and  Tibshirani  (1986)  used  this  RL/LL  criterion  to  compare 
bootstrap  confidence  intervals.  RC/LC  is  the  ratio  of  the  right  coverage  to  the  left 
coverage.  The  right  (left)  coverage  means  the  probability  that  the  true  parameter 
$\beta$ falls between the sample mean and the right (left) end of the bootstrap
confidence interval. This criterion reflects the shape and skewness of the
empirical cumulative distribution, which implies that if this quantity is very
close to the exact value then the method corrects the shape and skewness
satisfactorily.
These  two  criteria  would  give  us  a  good  description  of  the  shape  of  the  empirical 
cumulative  distribution.  The  third  criterion  we  use  in  these  Monte  Carlo 
experiments  is  the  length  of  the  bootstrap  confidence  interval. 
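
The three criteria can be sketched in code as follows. This is a minimal illustration, not the program used in the experiments: the function names, the 90% level, and the choice of the percentile method as the example interval are assumptions. The tables report each criterion relative to the EXACT interval, a normalization that is not shown here.

```python
import random

def percentile_interval(sample, level=0.90, n_boot=1000, rng=None):
    """Percentile (PC) bootstrap interval for the mean of `sample`."""
    rng = rng or random.Random(0)
    n = len(sample)
    means = sorted(sum(rng.choices(sample, k=n)) / n for _ in range(n_boot))
    k = round((1 - level) / 2 * n_boot)  # bootstrap means cut from each tail
    return means[k], means[n_boot - 1 - k]

def shape_criteria(q_left, q_right, x_bar):
    """Interval length and RL/LL of equation (2.11)."""
    length = q_right - q_left
    rl_ll = (q_right - x_bar) / (x_bar - q_left)
    return length, rl_ll

def rc_lc(results, beta):
    """Monte Carlo estimate of RC/LC of equation (2.12).

    `results` holds one (q_left, x_bar, q_right) triple per replication.
    Raises ZeroDivisionError if the left coverage count is zero.
    """
    right = sum(1 for ql, xb, qr in results if xb <= beta <= qr)
    left = sum(1 for ql, xb, qr in results if ql <= beta <= xb)
    return right / left
```

In a Monte Carlo run, one would call `percentile_interval` once per replication, collect the triples, and pass them to `rc_lc` together with the true mean.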

Along  with  decreasing  the  true  skewness  of  the  underlying  distribution  of 
X  to  zero,  we  construct  four  tables  by  generating  bootstrap  confidence  intervals 
using  the  different  methods.  Each  table  gives  us  the  average  results  of  500 
replications  with  1000  bootstraps  of  sample  size  20  for  each  replication.  We  have 
nine  different  methods  of  constructing  bootstrap  confidence  intervals.  The  exact 
confidence  interval  (EXACT)  is  generated  by  the  pivotal  quantity  method  directly 
from the true distribution of $\bar{x}$. The percentile method (PC), the
quasi-pivotal method (QP), the bias-corrected percentile method (BC), the
accelerated bias-corrected percentile method (BCa), the real bootstrap
confidence interval (RCI), the corrected (by the quasi-pivotal method) real
bootstrap confidence interval (CRCI), the bootstrap-t method (BST) and the
corrected (by the quasi-pivotal method) bootstrap-t method (CBST) generate
bootstrap confidence intervals from the empirical cumulative distribution of
$\hat{\beta}$, the estimator of the mean. With 500
replications,  we  give  the  averaged  bootstrap  confidence  intervals  in  tables  2.1  to 
2.4.  Of  the  three  criterion  columns  in  tables  2.1  through  2.4,  we  use  the  results 
from  the  exact  confidence  interval  as  an  index  for  comparisons. 

As we can see from tables 2.1 through 2.4, the quasi-pivotal corrected
versions of PC, RCI and BST, which are labeled QP, CRCI and CBST in the tables,
provide better estimates of the exact confidence intervals of the mean $\beta$
from the true distribution than do the uncorrected methods. The quasi-pivotal
method gives the best result if the true distribution is skewed, and always
provides the shortest confidence intervals. The two corrected methods, CRCI and
CBST, have small negative effects on RC/LC in tables 2.1 to 2.3, but greatly
improve RL/LL, yielding an overall improvement over the uncorrected methods, as
shown in table 2.5. The corrected real bootstrap confidence interval becomes the
closest to the true one when X is generated from a symmetric normal
distribution, as shown in table 2.4. The BC (bias-corrected percentile method)
and BCa (accelerated bias-corrected percentile method), reported in tables 2.1
through 2.5, do not perform as well as expected.

For each method and each sample distribution, we average the absolute
differences between the three criterion columns and their exact values of 1.000
and add the result to one; this gives the first four columns of table 2.5,
corresponding to tables 2.1 to 2.4. The average values of these first four
columns in table 2.5
form  the  last  column  in  the  table.  With  decreasing  skewness  from  column  1  to 
column  4,  all  of  the  entries  for  each  method  are  decreasing,  which  means  that  the 
less  the  skewness,  the  closer  are  the  results  to  the  exact  confidence  intervals  for 
each  method.  When  the  underlying  distribution  is  symmetric,  the  maximum 
underestimate  or  overestimate  is  only  about  2.4%.  It  is  shown  in  the  last  column 
of  table  2.5  that  the  quasi-pivotal  method  gets  the  best  results,  and  all  of  the 
corrected  methods  perform  better  than  the  uncorrected  methods. 
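
The entries of table 2.5 can be reproduced from the criterion columns of tables 2.1 to 2.4. The sketch below encodes one reading of the table caption (one plus the mean absolute deviation from the exact value); this reading is an inference, although it does reproduce the PC and QP entries for table 2.1. The function name is illustrative.

```python
def summary_index(length, rc_lc, rl_ll, exact=1.0):
    """One plus the mean absolute deviation of the three criteria from `exact`."""
    deviations = [abs(value - exact) for value in (length, rc_lc, rl_ll)]
    return exact + sum(deviations) / len(deviations)

# PC and QP rows of table 2.1: LENGTH, RC/LC, RL/LL.
print(round(summary_index(0.927, 1.331, 1.238), 3))
print(round(summary_index(0.888, 1.232, 1.014), 3))
```

A value of 1.000 means the method matches the exact interval on all three criteria; larger values mean larger average departures.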

2.5  Comparisons  of  the  Bootstrap  Methods 
To  compare  the  methods  of  generating  bootstrap  confidence  intervals  by 
investigating  the  difference  in  effects  from  bootstrapping  data  and  bootstrapping 
residuals,  we  consider  the  following  linear  regression 

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + u_i \tag{2.13}$$

where $u_i$ follows the standard normal distribution. Both exogenous variables
$x_1$ and $x_2$ are continuous and randomly drawn from the uniform distribution
on (-2,2) with sample size 40. The true parameter values are $\beta_0=0.4$,
$\beta_1=1.0$, and $\beta_2=1.0$. We estimate the model (2.13) with the
nonparametric estimation method of ordinary least squares (OLS). Three different
bootstrap methods are applied: one is Efron's nonparametric bootstrapping data
method; the second is the semiparametric bootstrapping residuals method; and the
third is the parametric bootstrapping residuals method.

We  first  generate  two  exogenous  variables  from  U(-2,2),  then  generate 
errors.  The  procedure  for  generating  the  errors  and  estimating  the  parameters  of 
this  linear  model  by  Efron's  nonparametric  bootstrap  method,  which  bootstraps 
the  data,  is  as  follows: 

Step 1: Generate two exogenous variables from U(-2,2) and errors $\{u_i\}$ from
        the standard normal distribution, then get $\{y_i\}$ according to
        equation (2.13) to create the sample
        $(Y,X)=\{(y_1,x_1),\ldots,(y_n,x_n)\}$. Then get the estimate
        $\hat{\beta}$ of the true $\beta$ by ordinary least squares (OLS).
Step 2: Bootstrap the sample (Y,X) in pairs by repeatedly randomly picking
        n pairs of $\{(y_i,x_i)\}$ with replacement to form a new bootstrap
        sample $(Y^*,X^*)=\{(y_1^*,x_1^*),\ldots,(y_n^*,x_n^*)\}$.
Step 3: Estimate this linear regression model by OLS with this bootstrap
        sample (Y*,X*) to get the bootstrap estimate $\beta^*$.
Step 4: Repeat step 2 to step 3 B=500 times.
Step 5: Find the mean and bootstrap trimmed mean of estimates, mean sum
        of squared residuals of Y (MSSRY), bootstrap confidence interval, and
        other criteria.
Step 6: Repeat step 1 through step 5 M=500 times (this is the super loop) to
        obtain the averages of the bootstrap estimates.
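
Steps 1 to 4 of the bootstrapping data procedure can be sketched as follows. This is a minimal illustration using only the Python standard library, not the original program: the helper names, the random seed, and the small normal-equations solver are assumptions made for the example, and the super loop of step 6 is omitted.

```python
import random

def ols(rows):
    """OLS for y = b0 + b1*x1 + b2*x2; each row is a (y, x1, x2) tuple."""
    X = [(1.0, x1, x2) for _, x1, x2 in rows]
    y = [row[0] for row in rows]
    # Augmented normal equations [X'X | X'y], a 3 x 4 system.
    A = [
        [sum(xi[j] * xi[k] for xi in X) for k in range(3)]
        + [sum(xi[j] * yi for xi, yi in zip(X, y))]
        for j in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(3):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[j][3] / A[j][j] for j in range(3)]

def simulate(n, beta, rng):
    """Step 1: draw x1, x2 from U(-2, 2) and u from N(0, 1)."""
    rows = []
    for _ in range(n):
        x1, x2 = rng.uniform(-2, 2), rng.uniform(-2, 2)
        u = rng.gauss(0.0, 1.0)
        rows.append((beta[0] + beta[1] * x1 + beta[2] * x2 + u, x1, x2))
    return rows

def pairs_bootstrap(rows, B, rng):
    """Steps 2 to 4: resample (y, x) pairs with replacement and refit."""
    n = len(rows)
    return [ols(rng.choices(rows, k=n)) for _ in range(B)]

rng = random.Random(1)
sample = simulate(40, (0.4, 1.0, 1.0), rng)
beta_hat = ols(sample)
boot = pairs_bootstrap(sample, 500, rng)
boot_mean = [sum(b[j] for b in boot) / len(boot) for j in range(3)]
```

Resampling whole rows keeps each $y_i$ attached to its own regressors, which is what distinguishes this method from the residual-based methods.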
The procedure for generating $y_i$ and estimating the parameters of the linear
regression model by the semiparametric bootstrapping residuals method differs
from the previous method only in steps 2 and 3:

Step 2: Find the residuals $\{\hat{u}_i\}$ according to

$$\hat{u}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{1i} - \hat{\beta}_2 x_{2i} \tag{2.14}$$

        In each bootstrap replication, bootstrap $\{\hat{u}_i\}$ to get
        $\{u_i^*\}$. Then get $\{y_i^*\}$ according to equation (2.13) to have
        a new bootstrap sample (Y*,X).
Step 3: Estimate the model by OLS with this bootstrap sample (Y*,X) to get
        the bootstrap estimate $\beta^*$.

The procedure for generating the errors and estimating the parameters of
this linear regression model by the parametric bootstrap method (this is a
bootstrapping residuals method) differs from the method of bootstrapping data
only in steps 2 and 3:

Step 2: Generate errors $\{u_i^*\}$ from the normal distribution with estimated
        variance, then get $\{y_i^*\}$ according to equation (2.13) to have a
        new bootstrap sample (Y*,X).
Step 3: Estimate the model by OLS with this bootstrap sample (Y*,X) to get
        the bootstrap estimate $\beta^*$.
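
The two residual-based variants can be sketched together, since they differ only in how the bootstrap errors are drawn. This is an illustrative sketch, not the original program. It assumes that the bootstrap responses are rebuilt from the OLS fit of the original sample, and that the estimated variance is the residual sum of squares divided by n - 3; the text does not spell out either detail. All names are hypothetical.

```python
import random

def ols(rows):
    """OLS for y = b0 + b1*x1 + b2*x2; each row is a (y, x1, x2) tuple."""
    X = [(1.0, x1, x2) for _, x1, x2 in rows]
    y = [row[0] for row in rows]
    A = [
        [sum(xi[j] * xi[k] for xi in X) for k in range(3)]
        + [sum(xi[j] * yi for xi, yi in zip(X, y))]
        for j in range(3)
    ]
    for c in range(3):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, 3), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(3):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[j][3] / A[j][j] for j in range(3)]

def residual_bootstrap(rows, B, rng, parametric=False):
    """Return the OLS estimate and B bootstrap estimates with X held fixed."""
    n = len(rows)
    b = ols(rows)
    fitted = [b[0] + b[1] * x1 + b[2] * x2 for _, x1, x2 in rows]
    resid = [row[0] - f for row, f in zip(rows, fitted)]  # equation (2.14)
    sigma = (sum(e * e for e in resid) / (n - 3)) ** 0.5   # assumed divisor
    estimates = []
    for _ in range(B):
        if parametric:
            u_star = [rng.gauss(0.0, sigma) for _ in range(n)]
        else:
            u_star = rng.choices(resid, k=n)  # resample residuals
        star = [(f + u, x1, x2)
                for f, u, (_, x1, x2) in zip(fitted, u_star, rows)]
        estimates.append(ols(star))
    return b, estimates

rng = random.Random(2)
rows = []
for _ in range(40):
    x1, x2 = rng.uniform(-2, 2), rng.uniform(-2, 2)
    rows.append((0.4 + x1 + x2 + rng.gauss(0.0, 1.0), x1, x2))
beta_hat, semi = residual_bootstrap(rows, 500, rng)
_, para = residual_bootstrap(rows, 500, rng, parametric=True)
```

Because the model contains an intercept, the OLS residuals already sum to zero, so no recentering is applied before resampling.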
The  purpose  of  this  Monte  Carlo  study  is  to  study  the  differences  between 
the  two  bootstrap  methods,  bootstrapping  data  and  bootstrapping  residuals.  To 
serve  this  purpose,  we  first  discuss  the  bootstrap  trimmed  mean  and  other 
criteria. 

Leger  and  Romano  (1990)  consider  the  use  of  the  bootstrap  to  adaptively 
choose  the  trimming  proportion  in  an  adaptive  trimmed  mean.  They  calculated  a 
symmetric trimmed mean by using the percentile confidence interval, i.e. they
found the mean of only those X's inside the percentile confidence interval of
$\beta$, which is the true mean of X. Their $\alpha$-trimmed mean is defined as
follows:

$$T_\alpha = (n - 2[n\alpha])^{-1} \sum_{i=[n\alpha]+1}^{n-[n\alpha]} X_{(i)} \tag{2.15}$$

where $\alpha \in [0, 1/2)$, $[\cdot]$ is the greatest integer function, and
$X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ are the order statistics. This method will be
applied to $\{\beta_0^{*(b)}\}$, $\{\beta_1^{*(b)}\}$, and $\{\beta_2^{*(b)}\}$
separately as "Trim II" in our Monte Carlo experiment presented in table 2.6.

We  also  use  a  new  method  to  find  a  bootstrap  trimmed  mean  by  trimming 
the  sum  of  squares  of  errors  of  a  regression  model.  For  500  bootstraps,  we  have 
500 estimates $\{\beta^{*(b)}\}$. We put them back in one model with the data
(Y,X) that is formed from all 500 bootstrap samples. Then, we apply the above
Trim II method to trim $\{SSE^{*(b)}\}$. If an $SSE^{*(b)}$ has been chosen, then
the corresponding $\beta^{*(b)}$ will be chosen. The last step is to find the
means of those chosen $\beta^*$'s. This method is called "Trim I" in table 2.6
and table 2.7. The advantage of this trimming method is that it keeps $\beta^*$'s
in pairs, trims them off in pairs, and averages them in pairs.
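
Both trimming schemes can be sketched briefly. The first function follows equation (2.15) directly. The second is only one possible reading of the Trim I description: it assumes that each bootstrap estimate is scored by its sum of squared errors on a single data set passed in by the caller, and that whole coefficient vectors are kept or dropped together. The trimming proportion and all names are hypothetical.

```python
import math

def trimmed_mean(values, alpha):
    """Alpha-trimmed mean of equation (2.15), the "Trim II" mean."""
    if not 0 <= alpha < 0.5:
        raise ValueError("alpha must lie in [0, 1/2)")
    x = sorted(values)
    n = len(x)
    k = math.floor(n * alpha)  # [n * alpha], the greatest integer function
    return sum(x[k:n - k]) / (n - 2 * k)

def sse(beta, rows):
    """Sum of squared errors of beta on rows of (y, x1, x2)."""
    return sum((y - beta[0] - beta[1] * x1 - beta[2] * x2) ** 2
               for y, x1, x2 in rows)

def trim_one_mean(betas, rows, alpha):
    """Trim I sketch: rank estimates by SSE, trim both tails, then average.

    Each coefficient vector is kept or dropped as a whole, so the
    components of the retained estimates stay together.
    """
    ranked = sorted(betas, key=lambda b: sse(b, rows))
    k = math.floor(len(ranked) * alpha)
    kept = ranked[k:len(ranked) - k]
    return [sum(b[j] for b in kept) / len(kept) for j in range(3)]
```

Applying `trimmed_mean` to each coefficient separately gives Trim II, whereas `trim_one_mean` makes a single keep-or-drop decision per bootstrap replication.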

Since  the  errors  are  generated  from  the  normal  distribution,  the  true 
confidence interval for $\beta$ should be symmetric. That is to say, a
correction for bias or skewness is not needed. Thus we consider only the
percentile method for generating bootstrap confidence intervals in this Monte
Carlo experiment.

The fact that the true confidence interval for $\beta$ is symmetric means that
the true RC/LC=1 and the true RL/LL=1. We calculate the TRIM I RC/LC (RL/LL)
using the Trim I mean (see eqs. (2.11) and (2.12)), by taking the sum of squared
differences between the bootstrap RC/LC (RL/LL) and 1 over all 500 replications.
BOOT RC/LC (RL/LL) is calculated similarly, by taking the sum of squared
differences between the bootstrap RC/LC (RL/LL) and 1 over all 500 replications.
Note that the TRIM I (or BOOT) RC/LC (or RL/LL) is different from RC/LC (or
RL/LL)  in  tables  2.1  to  2.5.  The  coverage  (that  is,  how  many  times  does  the 
calculated  confidence  interval  include  the  true  coefficient)  will  be  our  first  priority 
criterion  for  a  bootstrap  confidence  interval.  The  TRIM  I  (or  BOOT)  RC/LC  will  be 
the  second,  and  the  TRIM  I  (or  BOOT)  RL/LL  will  be  the  third.  The  smaller  of  the 
last  two  criteria  gives  the  better  bootstrap  method. 

In table 2.6, "OLS" is the average of the estimates of $\beta$ over 500
replications. Also in the table, "BOOTSTRAP", "TRIM I", and "TRIM II" are the
bootstrap mean, Trim I mean, and Trim II mean respectively over 500 replications.

When  it  comes  to  point  estimates,  comparing  these  three  bootstrap 
methods,  we  can  see  from  table  2.6  that  Efron's  nonparametric  bootstrapping  data 
method,  the  semiparametric  bootstrapping  residuals  method,  and  the  parametric 
bootstrap  method  are  almost  equivalent,  with  the  first  one  having  the  closest 
MSSRY  to  the  true  value.  The  bootstrap  mean,  trimming  I  mean,  and  trimming  II 
mean  are  all  equivalent  within  each  bootstrap  method. 

For  obtaining  the  90%  confidence  interval,  comparing  these  three  bootstrap 
methods,  we  can  see  from  table  2.7  that  the  semiparametric  bootstrapping 
residuals method does not give good coverage on $\beta_0$. Efron's nonparametric
bootstrapping data method gives low coverage and a large sum of squared
differences. The parametric bootstrap method gives a slightly low coverage, and
almost  the  smallest  sum  of  squared  differences.  Therefore  the  parametric 
bootstrap  method  performs  better  than  Efron's  nonparametric  bootstrap  method 
in  this  Monte  Carlo  experiment. 

2.6   Summary

The  quasi-pivotal  method  suggested  in  this  paper  gives  the  best  result 
theoretically  and  empirically  if  the  true  distribution  is  skewed.  If  the  true 
distribution  is  symmetric,  theoretically,  the  quasi-pivotal  method  is  equivalent  to 
the  percentile  method.  So  this  bootstrap  method  performs  satisfactorily  for  any 
underlying  true  distribution.  Using  this  method  as  a  correction  method  yields 
results  that  are  much  better  than  those  provided  by  the  uncorrected  method. 

In  the  case  of  symmetry,  the  corrected  real  bootstrap  confidence  interval 
gives  the  closest  result  to  the  true  in  our  Monte  Carlo  experiments.  In  reality, 
however,  most  distributions  are  asymmetric,  especially  for  small  samples. 
Therefore,  bootstrap  confidence  intervals,  which  are  most  useful  in  small  samples, 
should  be  generated  by  either  the  quasi-pivotal  method  or  a  method  corrected  by 
the  quasi-pivotal  method. 

Under  the  correct  specification,  the  parametric  bootstrapping  residuals 
method  for  generating  bootstrap  confidence  intervals  and  model  estimation 
performs satisfactorily in our Monte Carlo experiments. If the model is
misspecified, we would expect Efron's nonparametric bootstrap method to
perform better and the parametric bootstrap method to perform worse. In the case
of  symmetry,  the  bootstrap  mean,  the  proposed  trimmed  I  mean,  and  trimmed  II 
mean  are  all  equivalent  within  each  bootstrap  method.  Finally,  for  asymmetric 
distributions,  the  trimmed  means  are  expected  to  perform  better. 

Table 2.1:  X generated from $\Gamma(r=0.25,\lambda=1)$, $P(\bar{x}<\beta)=0.560$,
            sample size is 20 with 1000 times bootstrapping for 500 replications.

METHOD    INTERVAL         LENGTH    RC/LC    RL/LL
EXACT     0.075, 0.417     1.000     1.000    1.000
PC        0.140, 0.457     0.927     1.331    1.238
QP        0.128, 0.432     0.888     1.232    1.014
BC        0.150, 0.476     0.956     1.414    1.458
BCa       0.161, 0.515     1.035     1.544    1.876
RCI       0.161, 0.638     1.395     1.340    2.640
CRCI      0.133, 0.563     1.257     1.509    1.731
BST       0.150, 0.675     1.535     1.331    2.643
CBST      0.120, 0.593     1.383     1.498    1.732

Table 2.2:  X generated from the Chi-square distribution with 1 degree of
            freedom, $P(\bar{x}<\beta)=0.542$, sample size is 20 with 1000 times
            bootstrapping for 500 replications.

METHOD    INTERVAL         LENGTH    RC/LC    RL/LL
EXACT     0.488, 1.493     1.000     1.000    1.000
PC        0.526, 1.492     0.925     1.184    1.195
QP        0.529, 1.433     0.900     1.103    0.996
BC        0.590, 1.541     0.946     1.256    1.400
BCa       0.622, 1.628     1.001     1.357    1.747
RCI       0.614, 1.832     1.211     1.183    2.220
CRCI      0.539, 1.656     1.111     1.327    1.468
BST       0.590, 1.900     1.304     1.184    2.229
CBST      0.509, 1.710     1.195     1.328    1.475

Table 2.3:  X generated from the exponential distribution with mean=2,
            $P(\bar{x}<\beta)=0.530$, sample size is 20 with 1000 times
            bootstrapping for 500 replications.

METHOD    INTERVAL         LENGTH    RC/LC    RL/LL
EXACT     1.270, 2.716     1.000     1.000    1.000
PC        1.378, 2.771     0.963     1.123    1.136
QP        1.344, 2.710     0.945     1.073    1.004
BC        1.412, 2.824     0.977     1.170    1.276
BCa       1.452, 2.914     1.011     1.241    1.513
RCI       1.436, 3.063     1.125     1.126    1.708
CRCI      1.358, 2.904     1.069     1.226    1.292
BST       1.404, 3.125     1.190     1.123    1.709
CBST      1.321, 2.956     1.131     1.222    1.293

Table 2.4:  X generated from N(0,1), $P(\bar{x}<\beta)=0.500$, sample size is 20
            with 1000 times bootstrapping for 500 replications.

METHOD    INTERVAL          LENGTH    RC/LC    RL/LL
EXACT     -0.368, 0.368     1.000     1.000    1.000
PC        -0.357, 0.352     0.963     1.010    0.998
QP        -0.351, 0.349     0.951     1.014    1.008
BC        -0.360, 0.350     0.965     1.002    0.989
BCa       -0.362, 0.350     0.967     0.999    0.988
RCI       -0.376, 0.367     1.010     1.015    0.996
CRCI      -0.369, 0.363     0.995     1.011    1.001
BST       -0.391, 0.381     1.049     1.010    0.997
CBST      -0.383, 0.377     1.034     1.008    1.001

Table 2.5:  The average of the absolute differences from the exact results from
            the columns of LENGTH, RC/LC and RL/LL in previous tables.

METHOD    TABLE 2.1    TABLE 2.2    TABLE 2.3    TABLE 2.4    TOTAL
EXACT     1.000        1.000        1.000        1.000        1.000
PC        1.214        1.151        1.099        1.013        1.119
QP        1.119        1.069        1.044        1.024        1.064
BC        1.305        1.237        1.156        1.016        1.179
BCa       1.485        1.368        1.255        1.015        1.281
RCI       1.792        1.538        1.320        1.010        1.415
CRCI      1.499        1.302        1.196        1.006        1.251
BST       1.836        1.572        1.341        1.021        1.443
CBST      1.538        1.333        1.215        1.014        1.275

Table 2.6:  Estimates of the parameters and mean squared errors of the linear
            regression model with B=500, N=40 and M=500.

METHOD       $\beta_0$    $\beta_1$    $\beta_2$    MSSRY
TRUE         0.400        1.000        1.000        1.000
OLS          0.401        0.989        1.006        0.987

EFRON'S NONPARAMETRIC BOOTSTRAPPING OF DATA
BOOTSTRAP    0.401        0.989        1.005        0.988
TRIM I       0.401        0.989        1.005        0.988
TRIM II      0.401        0.989        1.005        0.988

SEMIPARAMETRIC BOOTSTRAPPING OF RESIDUALS
BOOTSTRAP    0.401        0.989        1.006        1.043
TRIM I       0.401        0.989        1.006        1.043
TRIM II      0.401        0.989        1.006        1.043

PARAMETRIC BOOTSTRAPPING OF RESIDUALS
BOOTSTRAP    0.402        0.989        1.005        1.119
TRIM I       0.402        0.989        1.005        1.119
TRIM II      0.402        0.989        1.005        1.119

Table 2.7:  Estimates of the parameters and mean squared errors of the linear
            regression model with B=500, N=40 and M=500.

CRITERIA         $\beta_0$        $\beta_1$        $\beta_2$

EFRON'S NONPARAMETRIC BOOTSTRAPPING OF DATA
INTERVAL         0.141, 0.661     0.765, 1.214     0.776, 1.234
COVERAGE         0.892            0.876            0.876
TRIM I RC/LC     2.194            2.099            2.198
TRIM I RL/LL     3.664            5.594            5.417
BOOT RC/LC       1.949            2.472            2.838
BOOT RL/LL       2.217            3.270            3.089

SEMIPARAMETRIC BOOTSTRAPPING OF RESIDUALS
INTERVAL         0.143, 0.660     0.764, 1.214     0.779, 1.232
COVERAGE         0.604            0.900            0.896
TRIM I RC/LC     2.165            1.887            1.873
TRIM I RL/LL     3.432            2.795            2.661
BOOT RC/LC       2.118            1.933            1.879
BOOT RL/LL       2.163            2.036            1.971

PARAMETRIC BOOTSTRAPPING OF RESIDUALS
INTERVAL         0.140, 0.664     0.760, 1.218     0.776, 1.236
COVERAGE         0.876            0.874            0.878
TRIM I RC/LC     1.901            1.966            2.296
TRIM I RL/LL     2.688            2.678            2.740
BOOT RC/LC       1.712            1.777            2.168
BOOT RL/LL       2.139            2.056            2.166

dist1:    $\bar{x}\sim\Gamma(r=5,\lambda=20)$ and $P(\bar{x}<\beta)=0.560$ with mean=0.25.
dist2:    $\bar{x}\sim\Gamma(r=10,\lambda=10)$ and $P(\bar{x}<\beta)=0.542$ with mean=1.
dist3:    $\bar{x}\sim\Gamma(r=20,\lambda=10)$ and $P(\bar{x}<\beta)=0.530$ with mean=2.

Figure 2.1: Three gamma distributions of $\bar{x}$.


CHAPTER  3 

BOOTSTRAP  METHODS  IN  BINARY 
RESPONSE  VARIABLE  MODELS 


3.1    Introduction 

During  recent  years  the  bootstrap  method  has  been  applied  in  many 
econometric  applications.  A  survey  paper  by  Jeong  and  Maddala  (1992),  however, 
points  out  that  there  are  several  questions  left  in  the  area  that  need  to  be  studied 
further.  This  chapter  discusses  the  differences  between  bootstrapping  data  and 
bootstrapping  residuals  in  binary  response  variable  models  of  logit  and  probit.  For 
models  with  limited  dependent  variables,  simple  bootstrap  methods  fail  to  keep  the 
censoring  properties  of  the  model.  There  are  different  ways  to  modify  the  bootstrap 
method  to  avoid  the  flaw.  For  example,  Efron  (1981)  proposed  a  bootstrap  method 
for  censored  data  which  keeps  the  properties  of  censoring.  For  binary  response 
variable  models,  we  compare  four  modifications  in  this  chapter. 

The  first  one  is  bootstrapping  data,  which  is  a  nonparametric  bootstrap 
method.  Teebagy  and  Chatterjee  (1989)  apply  Efron's  bootstrapping  data 
procedure  (Efron  1981)  to  the  logistic  regression  model.  The  Monte  Carlo  study 
they  conducted  shows  that  the  results  are  satisfactory.  The  rest  of  the  methods 
depend  on  bootstrapping  residuals.  Adkins  (1990)  estimated  bootstrap  standard 
errors  using  a  parametric  bootstrapping  residuals  method  in  a  probit  model,  got 
unstable  results,  and  argued  that  the  bootstrap  method  is  not  superior  to  MLE  for 
the  probit  model.  The  next  method  we  consider  is  to  use  a  parametric  bootstrap 
method by generating errors from the underlying distribution for each bootstrap
iteration.  The  last  method  we  consider,  which  turns  out  to  be  unsuccessful,  is  to 
bootstrap  generalized  residuals  to  estimate  the  binary  response  variable  models. 
The  concept  of  generalized  residuals  deserves  some  explanation.  Since  the 
first  definition  of  the  generalized  residual  by  Cox  and  Snell  (1968),  the  generalized 
residual  has  been  applied  to  several  limited  dependent  variable  models.  Lancaster 
(1985ab)  defined  generalized  residuals  and  featured  them  in  diagnostic  statistics 
to  detect  omitted  covariates  and  neglected  heterogeneity  in  duration  models. 
Chesher  and  Irish  (1987)  applied  graphical  and  numerical  analysis  of  residuals  to 
censored  data.  Gourieroux  et  al.  (1987a)  proposed  a  new  definition  of  generalized 
residuals  that  can  be  used  in  a  fairly  general  context,  especially  in  limited 
dependent  variable  models.  In  this  chapter,  we  apply  the  Gourieroux  et  al. 
definition  of  the  generalized  residual  to  binary  response  variable  models.  The 
generalized  residual  bootstrap  method  is  a  semiparametric  method,  and  the  model 
is  estimated  by  solving  a  nonlinear  equation  system  instead  of  using  the  maximum 
likelihood  estimation  method. 

3.2  Bootstrap  Methods  in  Binary  Response  Variable  Models 
As  we  know,  the  logit  and  probit  models  are  approximately  equivalent 
(Maddala  1983,  p23),  and  estimates  of  one  model  can  be  transformed  into  the 
estimates  of  the  other.  This  means  that  if  we  use  two  different  methods  to  estimate 
these  two  types  of  models,  we  should  have  comparable  results.  This  implies  that 
if  the  conclusions  of  the  two  models  conflict,  then  at  least  one  method  is 
questionable.  This  kind  of  difference  between  Teebagy  &  Chatterjee's  method 
(1989)  and  Adkins'  method  (1990)  is  discussed  through  a  comparison  of 
bootstrapping  data  and  bootstrapping  residuals. 

We should note that the bootstrap method is useful only in small samples.
In  large  samples,  the  bootstrap  method  does  not  give  more  accurate  estimates 
while  taking  a  lot  more  computer  time  than  the  asymptotic  method. 

To  bootstrap  the  data,   that  is,   to  apply  the  nonparametric  bootstrap 
method,  we  first  generate  the  data  from  a  binary  response  variable  model: 


$$y_i = \begin{cases} 1 & \text{if } x_i'\beta + u_i > 0 \\ 0 & \text{otherwise} \end{cases} \tag{3.1}$$

We will then have the data $(Y,X)=\{(y_1,x_1),\ldots,(y_n,x_n)\}$. Next, from
these data we randomly draw with replacement to create a new bootstrap sample
$(Y^*,X^*)=\{(y_1^*,x_1^*),\ldots,(y_n^*,x_n^*)\}$. Finally, we apply the
maximum likelihood estimation
method  to  the  bootstrap  sample  (Y*,X*)  to  get  an  estimate  of  the  parameter.  This 
procedure  is  repeated  B  times  (number  of  bootstraps)  to  either  form  a  bootstrap 
confidence  interval  of  the  parameter  or  to  get  a  bootstrap  point  estimate  of  the 
parameter  by  averaging  these  B  estimates.  Teebagy  and  Chatterjee  (1989)  apply 
this  modified  bootstrap  method  to  a  logit  model  in  a  Monte  Carlo  experiment.  They 
concluded  that  the  bootstrap  estimator  consistently  overestimates  the  true  value 
of  the  standard  errors  while  the  asymptotic  estimator  using  the  Fisher  information 
matrix  consistently  underestimates  them.  They  also  argued  that  in  small  samples 
the  bootstrap  standard  errors  are  substantially  closer  to  the  true  values  than  are 
the  asymptotic  standard  errors. 

To  bootstrap  the  residuals,  we  have  three  methods  to  discuss:  the 
parametric  bootstrap  method;  Adkins  bootstrap  method;  and  the  generalized 
residual  method  (this  method  will  be  discussed  in  the  next  section).  We  first 
generate  the  data  (Y,X)  from  the  binary  response  variable  model  of  equation  (3. 1). 
For   the   parametric   bootstrap    method,    in   each   iteration,   we   generate   an 
error $u^*$ from the underlying cumulative distribution function, call it $F$,
to get the new dependent variable

$$y_i^{*(b)} = \begin{cases} 1 & \text{if } x_i'\hat{\beta} + u_i^{*(b)} > 0 \\ 0 & \text{otherwise} \end{cases} \tag{3.2}$$

Then we can apply the maximum likelihood estimation method to the new
bootstrap sample $(Y^{*(b)},X)$ to get an estimate $\beta^{*(b)}$ of the
parameter $\beta$. This procedure is repeated B times.

For  Adkins'  bootstrap  method,  we  make  a  different  modification  of  the 
bootstrap  method.  Instead  of  generating  errors  from  the  cumulative  distribution 
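function $F$, we generate $\varepsilon^*$ from the uniform distribution on
(0,1). Also we use the maximum likelihood estimate $\hat{\beta}$ of the true
parameter $\beta$ from the original sample of (Y,X). Then for each bootstrap
replication, a binary dependent variable $y^*$ is given as

$$y_i^{*(b)} = \begin{cases} 1 & \text{if } \varepsilon_i^{*(b)} \in [0, F(x_i'\hat{\beta})] \\ 0 & \text{if } \varepsilon_i^{*(b)} \in (F(x_i'\hat{\beta}), 1] \end{cases} \tag{3.3}$$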


where  F  is  the  underlying  cumulative  distribution  function  of  error  u  in  model 
(3.1).  Thus  we  have  a  bootstrap  sample  (Y*,X),  from  which  we  can  estimate  the 
parameter  of  interest.  Adkins  (1990)  applied  this  method  to  a  probit  model  and 
concluded  that  this  bootstrap  method  is  not  superior  to  MLE  and  gives  unstable 
estimates,  making  it  inappropriate  for  the  probit  model. 

In model (3.2), the errors $\{u_i\}$ are directly generated from the underlying
cumulative distribution function $F$. If the cumulative distribution function of
$u_i$ is the logistic distribution, we have the logit model. If the cumulative
distribution function of $u_i$ is the standard normal, we have the probit model.
Actually, the parametric bootstrap method and Adkins' bootstrap method are
theoretically equivalent. Let us consider the probability of $y_i=1$ from model
(3.1)

$$P(y_i=1) = P(u_i > -x_i'\beta) = 1 - F(-x_i'\beta)$$

Because the logistic and standard normal distributions are symmetric about zero,
$1 - F(-x_i'\beta) = F(x_i'\beta)$, which leads to the generation of $y_i^*$
from model (3.3) by Adkins' bootstrap method.
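
The equivalence can be checked by simulation. The sketch below, for the logit case, draws the bootstrap response once by adding a logistic error to the index, as in equation (3.2), and once by comparing a uniform draw with $F$ evaluated at the index, as in equation (3.3). The index value, the number of draws, the seed, and the function names are arbitrary choices made for illustration.

```python
import math
import random

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_draw(rng):
    """Inverse-CDF draw from the standard logistic distribution."""
    p = rng.random()
    while p == 0.0:  # avoid log(0)
        p = rng.random()
    return math.log(p / (1.0 - p))

def y_parametric(index, rng):
    """Equation (3.2): y* = 1 if index + u* > 0."""
    return 1 if index + logistic_draw(rng) > 0 else 0

def y_adkins(index, rng):
    """Equation (3.3): y* = 1 if a uniform draw falls in [0, F(index)]."""
    return 1 if rng.random() <= logistic_cdf(index) else 0

rng = random.Random(0)
index = 0.5    # stands in for x_i' beta-hat
draws = 20000
freq_param = sum(y_parametric(index, rng) for _ in range(draws)) / draws
freq_adkins = sum(y_adkins(index, rng) for _ in range(draws)) / draws
print(freq_param, freq_adkins, logistic_cdf(index))
```

Both frequencies should be close to $F$ at the index, up to simulation noise, so the two schemes generate $y^*$ from the same Bernoulli distribution.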

Even  though  the  two  methods  are  equivalent,  there  is  an  important  reason 
to  consider  them  as  two  different  methods.  In  particular,  while  they  may  be  almost 
equivalent  numerically,  the  parametric  bootstrap  method  is  more  widely  applicable 
than  Adkins'  bootstrap  method,  since  the  latter  is  restricted  to  binary  response 
variable  models.  The  parametric  bootstrap  method,  though,  can  be  applied  to  tobit 
models  as  well.  For  simplicity,  the  following  discussion  about  the  parametric 
bootstrap  method  also  applies  to  Adkins'  bootstrap  method. 

Returning  to  the  differences  between  bootstrapping  data  and  bootstrapping 
residuals,  if  we  have  a  sample  (Y,X)  from  the  binary  response  variable  model  of 
equation  (3.1),  then  we  can  choose  between  two  methods  to  estimate  the  model. 
The  first  one,  bootstrapping  data,  a  nonparametric  method,  is  to  directly  estimate 
the  model  by  using  the  maximum  likelihood  estimation  method  with  a  bootstrap 
sample  (Y*,X*).  The  second  method,  bootstrapping  residuals,  a  parametric  method, 
is  to  generate  the  error  term  from  the  underlying  cumulative  distribution  function, 
and get a new $y^*$ according to equation (3.2) by using the error term $u^*$,
the data X, and the estimated parameter $\hat{\beta}$ from the original sample
(Y,X). This forms a new bootstrap sample (Y*,X), which is then used to estimate
the model.

Suppose the model is correctly specified, in the sense that the error
distribution  is  what  we  assume.  In  the  first  nonparametric  method,  we  only  have 
a  partial  observation  from  the  true  distribution,  which  may  not  be  representative 
of  the  population,  especially  in  small  samples.  This  sample  distribution  might  be 
skewed,  and  the  estimates  of  the  parameters  might  be  biased  because  we  are 
resampling  from  the  sample  data.  For  the  second  parametric  method,  if  we 
generate  y*  according  to  a  correctly  specified  error  distribution,  the  unusual 
features  of  the  sample  will  be  somewhat  mitigated.  The  parametric  method  should, 
therefore,  outperform  the  nonparametric  method  when  the  model  specification  is 
correct. 

When  the  model  is  misspecified,  however,  the  nonparametric  method  might 
work  better,  because  the  parametric  method  is  based  on  a  wrong  error 
distribution.  That  is,  the  parametric  method  is  not  only  inappropriate  for  the 
model,  but  also  uses  a  possibly  misspecified  model  to  change  the  observed  data 
from  Y  to  Y*.  Since  a  new  data  set  for  the  variable  Y*  is  generated  according  to  the 
misspecified  model  for  every  bootstrap  replication,  the  estimates  of  the  model 
would  be  pulled  away  from  the  true  parameter  in  the  same  direction  in  each 
replication.  Hence,  the  biases  would  accumulate  throughout  the  replications. 

We  should  note  that  the  maximum  likelihood  estimation  method  we  use  is 
a  parametric  estimation  method,  which  is  sensitive  to  the  assumption  of  the  model 
specification.  Hence,  only  the  correctly  specified  model  will  be  studied  in  this 
chapter. 

3.3   Generalized  Residuals 
Residuals are often used to examine the adequacy of a model specification
(Gourieroux et al. 1987b). Although this method did not work for the purpose at
hand, it remains instructive to explore the approach. The generalized
residual method is used to estimate a regression model. Think of the residuals as
the estimates of the errors of a model. Consider a linear regression model

$$y_i = \beta' x_i + \varepsilon_i, \qquad i = 1, 2, \ldots, n \tag{3.4}$$

where $\varepsilon_i$ is the error of the model. If $\hat\beta$ is an estimate of the parameter $\beta$,
then

$$\hat\varepsilon_i = y_i - \hat\beta' x_i \tag{3.5}$$

is the residual for the i-th observation. In the case of a nonlinear model,

$$y_i = g_i(x_i, \beta, \varepsilon_i), \qquad i = 1, 2, \ldots, n \tag{3.6}$$

where the errors $\{\varepsilon_i\}$ are independently and identically distributed, suppose the
equation for the i-th observation has a unique solution for $\varepsilon_i$,

$$\varepsilon_i = h_i(y_i, x_i; \beta). \tag{3.7}$$

This defines the generalized error for the i-th observation of the model. If we
replace $\beta$ by an estimate $\hat\beta$, we have the generalized residual defined in the sense
of Cox and Snell (1968),

$$\hat\varepsilon_i = h_i(y_i, x_i; \hat\beta), \qquad i = 1, 2, \ldots, n. \tag{3.8}$$

If the data are censored, as in the logit model, we cannot use this definition
of the generalized residual for every observation $y_i$, since it depends on the
unobservable variable $y_i^*$. In other words, in a censored regression model, it is
difficult to find the error terms directly. It seems natural to replace the errors $\varepsilon_i(\beta)$
by their best prediction $E_\beta[\varepsilon_i(\beta) \mid y_i]$. This leads us to the following definition of
Gourieroux et al. (1987a). The generalized error for the i-th observation is

$$\tilde\varepsilon_i(\beta) = E_\beta[\varepsilon_i(\beta) \mid y_i] \tag{3.9}$$

and the generalized residual for the i-th observation is

$$\hat{\tilde\varepsilon}_i = \tilde\varepsilon_i(\hat\beta) \tag{3.10}$$

where $\hat\beta$ is the ML or any other consistent estimate of $\beta$.

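To use the generalized residual bootstrap method to estimate a model, we
assume that the distribution of the dependent variable belongs to the exponential
family, i.e., the p.d.f. (probability density function) can be written as

$$f(y_i \mid x_i; \beta) = \exp\left[ Q'(x_i;\beta)\, T(y_i) + A(x_i;\beta) + B(x_i, y_i) \right]. \tag{3.11}$$

Then the log-likelihood function of the latent model is

$$L(\beta; y \mid x) = \sum_{i=1}^{n} \left[ Q'(x_i;\beta)\, T(y_i) + A(x_i;\beta) + B(x_i, y_i) \right] \tag{3.12}$$

and the normal equations, as proved by Gourieroux et al. (1987a), can be written
as

$$\frac{\partial L(\hat\beta; y \mid x)}{\partial \beta} = \sum_{i=1}^{n} \frac{\partial Q'(x_i;\hat\beta)}{\partial \beta}\, \hat{\tilde\varepsilon}_i = 0 \tag{3.13}$$

where $\hat{\tilde\varepsilon}_i$ is a generalized residual for the i-th observation and $\hat\beta$ is an estimate
of the parameter $\beta$.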

The p.d.f. of a dichotomous logit model is

$$f(y_i \mid x_i; \beta) = \left[ \frac{\exp(\beta' x_i)}{1 + \exp(\beta' x_i)} \right]^{y_i} \left[ \frac{1}{1 + \exp(\beta' x_i)} \right]^{1 - y_i} = \exp\left\{ \beta' x_i\, y_i - \log[1 + \exp(\beta' x_i)] \right\} \tag{3.14}$$

and the generalized residual is

$$\tilde\varepsilon_i(\beta) = y_i - \frac{1}{1 + \exp(-\beta' x_i)} \tag{3.15}$$

which is the difference between the observed $y_i$ and its expectation. With this
bilinear exponential density function (3.13), the normal equations from (3.10) and
(3.12) are

$$\sum_{i=1}^{n} x_i\, \hat{\tilde\varepsilon}_i = 0. \tag{3.16}$$

Solving this normal equation gives exactly the same solution as the
maximum likelihood estimate for both the logit and probit models. If we bootstrap
the generalized residuals $\{\hat{\tilde\varepsilon}_i\}$, we obtain the bootstrap generalized residuals
$\{\hat{\tilde\varepsilon}_i^*\}$. Then the new bootstrap normal equations are

$$\sum_{i=1}^{n} x_i\, \hat{\tilde\varepsilon}_i^* = 0. \tag{3.17}$$

Since we need to solve for $\beta$, the actual nonlinear equation system is

$$\sum_{i=1}^{n} x_i \left[ y_{i^*} - \frac{1}{1 + \exp(-\beta' x_{i^*})} \right] = 0 \tag{3.18}$$

where $i^*$ only represents the position; after bootstrapping, $x_{i^*}$ is no
longer the same as $x_i$.

When we used Newton's iterative method to solve this nonlinear equation
system for the estimate of $\beta$, the iterations did not converge. The
major reason might be that bootstrapping causes $x_i$ not to be matched
with $y_i$, so that $\hat{\tilde\varepsilon}_i^*$ is not necessarily matched with $x_i$. As a result, we
were unable to use the generalized residuals method.
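
As a numerical illustration of (3.15) and (3.16), the following Python sketch fits a logit model by Newton's method on simulated data and evaluates the normal equations at the estimate. It is only a sketch under stated assumptions: the helper names (`fit_logit`, `generalized_residuals`, `score`, `solve`), the sample size of 200, and the seed are illustrative choices and are not taken from the original study.

```python
import math
import random


def sigmoid(z):
    """Logistic c.d.f. 1 / (1 + exp(-z)), arranged to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def generalized_residuals(beta, xs, ys):
    """Equation (3.15): y_i - 1 / (1 + exp(-beta'x_i))."""
    return [y - sigmoid(dot(beta, x)) for x, y in zip(xs, ys)]


def score(beta, xs, ys):
    """Left-hand side of (3.16): sum over i of x_i times the residual."""
    res = generalized_residuals(beta, xs, ys)
    return [sum(x[j] * e for x, e in zip(xs, res)) for j in range(len(beta))]


def solve(a, b):
    """Solve a small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))  # partial pivoting
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [u - f * v for u, v in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_logit(xs, ys, max_iter=50, tol=1e-10):
    """Newton's method applied to the normal equations (3.16)."""
    k = len(xs[0])
    beta = [0.0] * k
    for _ in range(max_iter):
        p = [sigmoid(dot(beta, x)) for x in xs]
        # Information matrix: sum of p_i (1 - p_i) x_i x_i'
        info = [[sum(pi * (1 - pi) * x[r] * x[c] for pi, x in zip(p, xs))
                 for c in range(k)] for r in range(k)]
        step = solve(info, score(beta, xs, ys))
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


# Simulated data (assumed design: intercept plus two U(-2,2) regressors).
rng = random.Random(0)
true_beta = [0.4, 1.0, 1.0]
xs = [[1.0, rng.uniform(-2, 2), rng.uniform(-2, 2)] for _ in range(200)]
ys = [1 if rng.random() < sigmoid(dot(true_beta, x)) else 0 for x in xs]

beta_hat = fit_logit(xs, ys)
score_at_mle = score(beta_hat, xs, ys)   # each component should be close to zero
```

Because `x` and `y` stay matched observation by observation here, the iteration has a well-defined target. Shuffling the residuals against the regressors, as in (3.18), removes that matching.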



3.4   Monte  Carlo  Experiments  for  Bootstrap  Methods 
We generate the data from the following binary response variable model

$$y_i = \begin{cases} 1 & \text{if } \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + u_i > 0 \\ 0 & \text{otherwise} \end{cases} \tag{3.19}$$

to compare the different bootstrap methods. Four methods are discussed: the logit
maximum likelihood estimation method, Efron's nonparametric bootstrap method,
the parametric bootstrap method, and Adkins' parametric bootstrap method. Since
the bootstrap generalized residual method did not converge, we cannot present
the results of this method. The model we describe has two continuous exogenous
variables and an intercept. The true parameter values are $\beta_0 = 0.4$, $\beta_1 = 1$, and $\beta_2 = 1$.
About 40 percent of the observations of Y are censored in the experiment. Both of
the continuous variables are drawn from the uniform distribution over the range
(-2,2). The sample size is 40.

We  first  generate  two  exogenous  variables  from  U(-2,2).  We  then  generate 
the  errors.  The  procedure  for  generating  the  errors  and  estimating  the  parameters 
of  the  logit  model  by  Efron's  nonparametric  bootstrap  method,  which  is  the 
bootstrapping  data  method,  is  as  follows: 
Step 1: Generate two exogenous variables from U(-2,2) and errors $\{u_i\}$ from
        the logistic distribution, then obtain $\{y_i\}$ according to equation (3.19) to
        form the sample $(Y,X) = \{(y_1,x_1), \ldots, (y_n,x_n)\}$.
Step 2: Bootstrap the sample (Y,X) in pairs by randomly drawing
        n pairs $(y_i,x_i)$ with replacement to form a new bootstrap sample
        $(Y^*,X^*) = \{(y_1^*,x_1^*), \ldots, (y_n^*,x_n^*)\}$.
Step 3: Estimate the logit model by the maximum likelihood estimation
        method with this bootstrap sample (Y*,X*) to get the bootstrap
        estimate $\beta^*$.
Step 4: Repeat step 2 to step 3 B=100 times.
Step 5: Find the mean of the estimates, E(Y), and the sum of squared differences
        between $y_i$ and its prediction.
Step 6: Repeat step 1 through step 5 M=1000 times (this is the super loop)
        to obtain the averages of the bootstrap estimates, their biases, and their
        mean squared errors.
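
Steps 2 through 4 above can be sketched in Python as follows. This is a minimal illustration, not the code used in the study: the function name `pairs_bootstrap` and the argument `fit` are hypothetical, and `fit` stands in for the logit maximum likelihood estimator of Step 3. The toy usage replaces the estimator with the share of ones in y so that the block stays self-contained.

```python
import random


def pairs_bootstrap(ys, xs, fit, n_boot=100, seed=0):
    """Resample (y_i, x_i) pairs with replacement and re-estimate each time.

    `fit` is any function mapping (ys, xs) to an estimate; it is a
    placeholder for the logit MLE of Step 3.
    """
    rng = random.Random(seed)
    n = len(ys)
    estimates = []
    for _ in range(n_boot):                          # Step 4: B replications
        idx = [rng.randrange(n) for _ in range(n)]   # Step 2: n draws with replacement
        y_star = [ys[i] for i in idx]                # y and x are drawn together,
        x_star = [xs[i] for i in idx]                # so each pair stays matched
        estimates.append(fit(y_star, x_star))        # Step 3: re-estimate
    return estimates


def share_of_ones(ys, xs):
    """Toy estimator used only for this demonstration."""
    return sum(ys) / len(ys)


ys = [1, 0, 1, 1, 0, 1, 0, 1]
xs = [[1.0, 0.1 * i] for i in range(len(ys))]
boot = pairs_bootstrap(ys, xs, share_of_ones, n_boot=100)
boot_mean = sum(boot) / len(boot)                    # mean of the bootstrap estimates
```

Drawing a single index vector and applying it to both y and x is what keeps each observation intact under this scheme.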
The procedure for generating the errors and estimating the parameters of
the logit model by the parametric bootstrap method, which is the bootstrapping
residuals method, differs from the previous method only in step 2 and step 3:
Step 2: Generate errors $\{u_i^*\}$ from the logistic distribution, then obtain $\{y_i^*\}$
        according to equation (3.19) to form a new bootstrap sample (Y*,X).
Step 3: Estimate the logit model by the maximum likelihood estimation
        method with this bootstrap sample (Y*,X) to get the bootstrap
        estimate $\beta^*$.
The procedure for generating $y_i$ and estimating the parameters of the logit
model by Adkins' parametric bootstrap residuals method differs from the method
of bootstrapping data only in step 2 and step 3:
Step 2: Generate $\{\varepsilon_i^*\}$ from uniform (0,1), then obtain $\{y_i^*\}$ according to (3.3) to
        form a new bootstrap sample $(Y^*,X) = \{(y_1^*,x_1), \ldots, (y_n^*,x_n)\}$.
Step 3: Estimate the logit model by the maximum likelihood estimation
        method with this bootstrap sample (Y*,X) to get the bootstrap
        estimate $\beta^*$.
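
The two ways of generating $y^*$ in Step 2 can be compared with a short Python sketch. Equation (3.3) is not reproduced in this section, so the uniform rule below (set $y^* = 1$ when the uniform draw falls below the fitted logistic probability) is an assumption about its form. The function names and the fitted index value 0.4 are likewise hypothetical.

```python
import math
import random


def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))


def draw_y_logistic_errors(index, rng):
    """Parametric scheme: y* = 1 if index + u* > 0, with u* logistic."""
    ys = []
    for v in index:
        p = rng.random()
        while p == 0.0:                  # guard against log(0)
            p = rng.random()
        u = math.log(p / (1.0 - p))      # inverse logistic c.d.f.
        ys.append(1 if v + u > 0 else 0)
    return ys


def draw_y_uniform(index, rng):
    """Uniform scheme (assumed form): y* = 1 if e* < F(index), e* ~ U(0,1)."""
    return [1 if rng.random() < logistic_cdf(v) else 0 for v in index]


rng = random.Random(1)
index = [0.4] * 20000                    # hypothetical fitted index for every draw
freq_logistic = sum(draw_y_logistic_errors(index, rng)) / len(index)
freq_uniform = sum(draw_y_uniform(index, rng)) / len(index)
target = logistic_cdf(0.4)               # P(y* = 1) implied by the logit model
```

Under this assumed form, both schemes give $P(y^* = 1) = F(\hat\beta' x)$ because the logistic distribution is symmetric about zero, so the two relative frequencies should both be close to `target`.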
The  purpose  of  this  Monte  Carlo  experiment  is  to  study  the  differences 
between  the  two  bootstrap  methods,  bootstrapping  data  and  bootstrapping 
residuals.  To  serve  this  purpose,  we  first  discuss  the  comparison  criteria. 
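To compare the point estimates, we will first check the biases and the
mean squared errors of the estimates of $\beta$, using the fact that

$$\underbrace{\frac{1}{M} \sum_{m=1}^{M} (\beta_m^* - \beta)^2}_{\mathrm{MSE}_\beta} = \underbrace{\frac{1}{M} \sum_{m=1}^{M} (\beta_m^* - \bar\beta^*)^2}_{\mathrm{RSS}_\beta} + \underbrace{(\bar\beta^* - \beta)^2}_{\mathrm{BIAS}_\beta} \tag{3.20}$$

for each individual $\beta^*$. Note that $\mathrm{BIAS}_\beta$ as defined here is the squared deviation
of the mean estimate from the true value. Since the three estimates of the $\beta$'s may behave differently, we
might need two kinds of overall criteria for the estimation. The first set is the
expectation of Y and its bias.

$$E(Y) = P[Y = 1] = P[u_i > -(\beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i})] = 1 - F[-(\beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i})]$$

For the logit model it is given by

$$E(Y) \approx \frac{1}{N} \sum_{i=1}^{N} \frac{1}{1 + \exp[-(\beta_0^* + \beta_1^* x_{1i} + \beta_2^* x_{2i})]} \tag{3.21}$$

For the probit model it is given by

$$E(Y) \approx \frac{1}{N} \sum_{i=1}^{N} \Phi[\beta_0^* + \beta_1^* x_{1i} + \beta_2^* x_{2i}] \tag{3.22}$$

where $\Phi$ is the cumulative distribution function of the standard normal. The
approximately true value of E(Y) is obtained by using the true values of the $\beta$'s instead
of the $\beta^*$'s in equations (3.21) and (3.22).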

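The second comparison uses the sum of squared differences between $y_i$ and
its prediction, denoted as the sum of squared residuals of Y (SSRY). We also consider
the sum of squared generalized residuals of Y (SSGRY), the sum of squared
differences between $y_i$ and its expectation. They are

$$\mathrm{SSRY} = \sum_{i=1}^{N} (y_i - \hat y_i)^2 \tag{3.23}$$

$$\mathrm{SSGRY} = \sum_{i=1}^{N} \left[ y_i - \frac{1}{1 + \exp[-(\beta_0^* + \beta_1^* x_{1i} + \beta_2^* x_{2i})]} \right]^2 \quad \text{(logit model)} \tag{3.24}$$

$$\mathrm{SSGRY} = \sum_{i=1}^{N} \left[ y_i - \Phi(\beta_0^* + \beta_1^* x_{1i} + \beta_2^* x_{2i}) \right]^2 \quad \text{(probit model)} \tag{3.25}$$

where the prediction of Y is

$$\hat y_i = \begin{cases} 1 & \text{if } \beta_0^* + \beta_1^* x_{1i} + \beta_2^* x_{2i} > 0 \\ 0 & \text{otherwise} \end{cases} \tag{3.26}$$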
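
For a given estimate, the logit versions of the three criteria can be computed as in the following Python sketch. The function name `logit_criteria` and the three-observation data set are hypothetical and serve only to make the definitions (3.21), (3.23), (3.24) and (3.26) concrete.

```python
import math


def logit_criteria(beta, xs, ys):
    """Return (estimate of E(Y), SSRY, SSGRY) for a logit model at `beta`."""
    index = [sum(b * v for b, v in zip(beta, x)) for x in xs]
    probs = [1.0 / (1.0 + math.exp(-z)) for z in index]
    e_y = sum(probs) / len(probs)                           # (3.21)
    y_hat = [1 if z > 0 else 0 for z in index]              # (3.26)
    ssry = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))   # (3.23)
    ssgry = sum((y - p) ** 2 for y, p in zip(ys, probs))    # (3.24)
    return e_y, ssry, ssgry


beta = [0.4, 1.0, 1.0]                                      # intercept and two slopes
xs = [[1.0, 0.0, 0.0], [1.0, -1.0, -1.0], [1.0, 1.0, 0.5]]
ys = [1, 0, 0]
e_y, ssry, ssgry = logit_criteria(beta, xs, ys)
# Only the third observation is misclassified, so ssry equals 1.
```

SSRY counts misclassified observations, while SSGRY also reflects how far each fitted probability lies from the observed outcome.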


As we know, the parametric bootstrap method is theoretically equivalent to
Adkins' bootstrap method, so the two should be equivalent in the Monte Carlo
experiment. This turns out to be true, as can be seen by examining tables 3.1
through 3.3. There is no significant difference in any criterion between these two
methods. Keep in mind, however, that the parametric bootstrap method is more
general.

Comparing bootstrapping data with bootstrapping
residuals, we can see from table 3.1 and table 3.2 that the results from Efron's
nonparametric bootstrapping data method are very close to the results from the
method of parametric bootstrapping residuals. For the logit model, the biases are very
close, and the $\mathrm{MSE}_\beta$ of the nonparametric bootstrap method is only
about 5% less than the $\mathrm{MSE}_\beta$ of the parametric bootstrap method. For the probit
model, the nonparametric bootstrap method gives lower values for both biases and
$\mathrm{MSE}_\beta$; its $\mathrm{MSE}_\beta$ is about 20% less than that of the
parametric bootstrap method. But when we look at table 3.3, the SSRY's are
significantly higher for Efron's nonparametric bootstrap method than for the
parametric bootstrap method for both the logit and probit models. This suggests
that the parametric bootstrap method is more reliable than the method of directly
bootstrapping data. Again, we should note that we are dealing with a parametric
estimation method, i.e., the maximum likelihood estimation method.

Comparing the maximum likelihood estimation (MLE) method with the
parametric bootstrap method, we can see that the estimates of the parameters are
significantly less biased and have a smaller $\mathrm{MSE}_\beta$ for the MLE than for the
parametric bootstrap method in both the logit and probit models (table 3.1 and
table 3.2). But the latter gives lower SSRY's for both the logit and probit models by
about 10% (table 3.3). Since the overall expectations E(Y) are the same, the
parametric bootstrap method might be better because of the smaller confidence
bounds and lower variance when the specification is correct.

Comparing the estimation of the logit model with that of the probit model, we
can see from tables 3.1 through 3.3 that there are greater biases, greater $\mathrm{RSS}_\beta$, as
well as greater $\mathrm{MSE}_\beta$ for the estimates of the probit model than for those of the
logit model, which has greater error variance at the time of generation. For the
logit model, the $\mathrm{RSS}_\beta$'s for the three different $\beta$'s are close. But for the probit
model, the $\mathrm{RSS}_\beta$ of $\beta_0$ is smaller than that from the logit model, whereas the $\mathrm{RSS}_\beta$'s
of $\beta_1$ and $\beta_2$ are greater than those from the logit model. Overall, the estimates of
the probit model have greater variance about the true parameters, but have a
lower SSRY than the logit model (table 3.3). In addition, the iterations for
estimating the probit model converged much more slowly than those of the logit model.

Regardless  of  the  models  or  the  methods,  all  have  excellent  estimations  of 
E(Y),  as  shown  in  tables  3.1  through  3.3. 
From table 3.3, we can see that the sum of squared generalized residuals
(SSGRY)  and  sum  of  squared  residuals  (SSRY)  seem  to  have  the  same  power  as 
criteria.  Even  though  they  have  different  values,  they  have  the  same  pattern  of 
variation  for  both  the  logit  and  probit  models.  This  implies  that  the  generalized 
residuals  represent  the  residuals  well. 

In  the  case  of  misspecification,  a  parametric  estimation  method,  such  as 
the  maximum  likelihood  estimation  method,  would  be  sensitive.  Therefore  a 
nonparametric  (or  semiparametric)  bootstrap  method  with  a  nonparametric 
estimation  method  might  outperform  the  parametric  bootstrap  method. 

3.5   Summary 

In  a  correctly  specified  model,  with  an  efficient  parametric  estimation 
method,  the  parametric  bootstrap  estimation  method  gives  better  results  than  the 
nonparametric  bootstrapping  data  method.  The  parametric  bootstrap  method, 
which  is  more  general,  is  equivalent  to  Adkins'  bootstrap  method  in  these  binary 
response  variable  models.  The  parametric  bootstrap  method  gives  smaller  variance 
of  the  prediction  and  greater  mean  squared  errors  of  the  estimates  than  the 
maximum  likelihood  estimation  method. 

In a misspecified model, we first need to find an efficient and robust
estimation method, and then choose an appropriate bootstrap method according
to whether that estimation method is parametric. The nonparametric
bootstrap method would probably be an appropriate method.

For a correctly specified logit model, the parametric bootstrap method with
the logit maximum likelihood estimation method provides the most reliable
estimates among the bootstrap estimates. For a correctly specified probit
model, the parametric bootstrap method with the probit maximum likelihood
estimation method gives the most reliable estimates among the bootstrap
estimates.

Using the maximum likelihood estimation method, the estimates of the logit
model are more reliable, have less variance, and converge faster than the
estimates of the probit model.

Because bootstrapping the nonlinear equation system prevented the iterations
from converging, we were not able
to apply the bootstrap method to the generalized residual estimation method.
Table 3.1: Correctly specified logit model with B=100, N=40 and M=1000.

BETA      TRUE      MEAN      BIAS      RSS_β     MSE_β

LOGIT MAXIMUM LIKELIHOOD ESTIMATION
β0        0.400     0.501     0.010     0.386     0.396
β1        1.000     1.188     0.035     0.348     0.383
β2        1.000     1.204     0.042     0.383     0.424
E(Y)*     0.544     0.544     0.000     --        --

EFRON'S NONPARAMETRIC BOOTSTRAPPING OF DATA
β0        0.400     0.635     0.055     0.658     0.713
β1        1.000     1.473     0.223     0.642     0.866
β2        1.000     1.495     0.245     0.697     0.942
E(Y)*     0.544     0.544     0.000     --        --

PARAMETRIC BOOTSTRAPPING OF RESIDUALS
β0        0.400     0.631     0.053     0.687     0.740
β1        1.000     1.473     0.224     0.702     0.926
β2        1.000     1.500     0.250     0.749     0.999
E(Y)*     0.544     0.544     0.000     --        --

ADKINS' PARAMETRIC BOOTSTRAPPING OF RESIDUALS
β0        0.400     0.635     0.055     0.686     0.741
β1        1.000     1.473     0.224     0.681     0.905
β2        1.000     1.499     0.249     0.769     1.018
E(Y)*     0.544     0.544     0.000     --        --

* The value for E(Y) is approximate.
Table 3.2: Correctly specified probit model with B=100, N=40 and M=500.

BETA      TRUE      MEAN      BIAS      RSS_β     MSE_β

PROBIT MAXIMUM LIKELIHOOD ESTIMATION
β0        0.400     0.484     0.007     0.153     0.160
β1        1.000     1.268     0.072     0.510     0.582
β2        1.000     1.254     0.064     0.535     0.599
E(Y)*     0.636     0.638     0.000     --        --

EFRON'S NONPARAMETRIC BOOTSTRAPPING OF DATA
β0        0.400     0.613     0.045     0.310     0.356
β1        1.000     1.665     0.442     0.856     1.298
β2        1.000     1.631     0.398     0.885     1.284
E(Y)*     0.636     0.637     0.000     --        --

PARAMETRIC BOOTSTRAPPING OF RESIDUALS
β0        0.400     0.644     0.059     0.353     0.413
β1        1.000     1.763     0.583     1.188     1.771
β2        1.000     1.736     0.542     1.197     1.738
E(Y)*     0.636     0.637     0.000     --        --

ADKINS' PARAMETRIC BOOTSTRAPPING OF RESIDUALS
β0        0.400     0.631     0.054     0.341     0.395
β1        1.000     1.767     0.588     1.255     1.843
β2        1.000     1.738     0.544     1.232     1.776
E(Y)*     0.636     0.637     0.000     --        --

* The value for E(Y) is approximate.


Table 3.3: Comparison of criteria among different methods.

METHOD          SSRY      SSGRY     BIAS OF E(Y)

LOGIT MODEL ESTIMATION (M=1000)
LOGIT MLE       8.691     5.877     0.000
EFRON           19.038    13.909    0.000
PARAMETRIC      7.834     5.337     0.000
ADKINS          7.846     5.338     0.000

PROBIT MODEL ESTIMATION (M=500)
PROBIT MLE      6.322     4.296     0.000
EFRON           17.642    14.195    0.000
PARAMETRIC      5.598     3.787     0.000
ADKINS          5.613     3.800     0.000


CHAPTER  4 
BOOTSTRAP  METHODS  IN  THE  TOBIT  MODEL 

4.1   Introduction

Many  of  the  recent  developments  in  econometric  methods  have  been  in  the 
area  of  limited  dependent  variable  models,  that  is,  regression  models  where  the 
range  of  the  dependent  variable  is  restricted  to  some  subset  of  the  real  line.  The 
regression  model  with  a  nonnegative  constraint  on  the  dependent  variable,  the  so- 
called  tobit  model,  was  proposed  by  Tobin  (1958).  The  strong  consistency  and  the 
asymptotic  normality  of  the  maximum  likelihood  estimator  of  the  tobit  model  were 
proved  by  Amemiya  (1973).  And  it  was  shown  by  Olsen  (1978)  that  if  the  iterative 
process  of  the  maximum  likelihood  estimation  (MLE)  yields  a  solution,  it  will  be 
the  global  maximum  of  the  likelihood  function;  i.e.,  with  the  tobit  MLE  method, 
given  any  initial  value,  if  it  converges,  then  the  estimator  will  be  the  only 
consistent  and  asymptotically  normal  maximum  likelihood  estimator. 

However,  it  is  well  known  that  the  tobit  ML  estimator  is  sensitive  to  the 
assumptions  of  normality  and  homoskedasticity.  The  presence  of  either 
nonnormality  or  heteroskedasticity  can  result  in  inconsistency  of  the  maximum 
likelihood  estimator.  There  are  several  papers  discussing  the  sensitivity  to 
nonnormality  (Arabmazar  &  Schmidt  1982,  Goldberger  1983)  and  the  sensitivity 
to  heteroskedasticity  (Arabmazar  &  Schmidt  1981,  Hurd  1979)  of  the  model. 
Powell  (1984)  proposed  an  alternative  to  the  maximum  likelihood  estimator,  which 
is a generalization of the least absolute deviations estimation for a standard linear
model,  that  is  robust  to  heteroskedasticity.  Later  on,  he  (Powell  1986)  proposed 
a  symmetrically  censored  least  squares  estimator.  Both  estimators  have  certain 
robustness  properties,  but  can  be  very  inefficient  under  the  correct  specification 
for  they  disregard  the  information  contained  in  the  parametric  assumptions. 
Peracchi  (1990)  introduced  a  class  of  bounded-influence  estimators  for  the  tobit 
model.  These  estimators  provide  a  compromise  between  efficiency  and  robustness, 
thereby  attaining  high  efficiency  in  the  tobit  model  and  being  robust  in  probability 
distribution. 

Efron  (1981)  applied  a  nonparametric  bootstrap  method  to  censored  data 
to  keep  the  property  of  censoring  by  bootstrapping  data  directly.  Flood  (1985) 
introduced  an  augmented  semiparametric  bootstrap  method  to  obtain  standard 
errors  of  system  tobit  coefficients,  but  this  method  does  not  retain  the  property  of 
censoring  of  the  data. 

In  this  chapter,  we  investigate  the  differences  between  bootstrapping  data 
and  bootstrapping  residuals  in  the  tobit  model.  To  this  end,  we  also  propose  a 
mixed,  semiparametric,  bootstrap  method  based  on  the  tobit  MLE;  and  we  apply 
the  balanced  resampling  technique  to  Efron's  nonparametric  bootstrap  method  to 
estimate  the  tobit  model. 

4.2  Applications  of  Bootstrap  Methods  to  the  Tobit  Model 
There  are  many  ways  to  get  estimators  of  the  tobit  model  under  the 
assumptions  of  the  model.  For  instance,  the  probit  maximum  likelihood  estimator 
(Amemiya  1978)  is  consistent;  Heckman's  two-step  estimation  (Heckman  1976)  is 
consistent;  the  tobit  maximum  likelihood  estimator  is  strongly  consistent  and 
asymptotically  normal  (Amemiya  1973)  and  unique  (Olsen  1978)  if  the  iteration 
converges. In addition, weighted least squares, nonlinear least squares, nonlinear
weighted  least  squares,  EM  algorithm  (Hartley  1958),  censored  least  absolute 
deviations  estimation  (Powell  1984),  symmetrically  censored  least  squares 
estimation  (Powell  1986)  and  many  more  techniques  can  be  used. 

With the correct specification, because of its strong consistency and
uniqueness, the tobit maximum likelihood estimator is more efficient and also
easy to obtain. Combining the tobit MLE with a bootstrap method, we can get
several bootstrap estimators. The purpose of this chapter is to compare bootstrap
methods by comparing the bias reduction and mean squared error ($\mathrm{MSE}_\beta$)
reduction of the different bootstrap estimates, and also the relative sum of squared
residuals of Y (SSRY).

The tobit model is

$$y_i = \begin{cases} \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + u_i & \text{if } \mathrm{RHS}_i > 0 \\ 0 & \text{otherwise} \end{cases} \tag{4.1}$$

where the $u_i$ are independently and identically distributed normal with mean zero
and variance $\sigma^2$. Let $x_i = (1, x_{1i}, x_{2i})'$ and $\beta = (\beta_0, \beta_1, \beta_2)'$ be column vectors. We can derive
the augmented bootstrap procedure as follows. First, estimate the model by
the tobit MLE and compute $\hat u^+$, where $\hat u^+$ is the vector of residuals for the
observations for which the $y_i$'s are positive. Second, an augmented residual
vector $\hat u^A$ is constructed, where $\hat u^A = [\hat u^+ \mid -\hat u^+]$. If the total sample size is N and the
$y_i$'s are positive for r observations, then this vector $\hat u^A$ will be of order 2r.
Third, $\hat u^A$ is resampled with replacement to create a bootstrap sample $u^*$ of size N.
Fourth, $y^*$ is constructed using $u^*$ and $\hat\beta$ according to $y_i^* = \max(\mathrm{RHS}_i^*, 0)$ from
equation (4.1). This new bootstrap sample (Y*,X) is used to compute the bootstrap
estimate of $\beta$. The procedure is repeated to get the average bootstrap estimates of
$\beta$.

Through this semiparametric bootstrapping procedure, we can see that, first,
if we want the bootstrap sample to maintain the censoring property, then (y,x)
should be paired as in Efron (1981). This, however, is not the case in the
augmented bootstrap method proposed by Flood (1985). Second, the augmented
errors have been forced to be symmetric, which may not be the case in reality.
Therefore, we will propose two more bootstrap methods besides the augmented
semiparametric bootstrap method, Efron's nonparametric bootstrap method, and
the parametric bootstrap method.
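
The four steps of the augmented bootstrap can be sketched in Python as below. This is an illustration under stated assumptions: the function name `augmented_bootstrap_y` is hypothetical, the tobit MLE is not implemented, and the true coefficients of a simulated sample are used in place of the estimate $\hat\beta$.

```python
import random


def augmented_bootstrap_y(beta_hat, xs, ys, rng):
    """One bootstrap draw of y* from the augmented residual vector."""
    index = [sum(b * v for b, v in zip(beta_hat, x)) for x in xs]
    # Residuals for the r observations with positive y.
    u_plus = [y - z for y, z in zip(ys, index) if y > 0]
    # Augmented vector of order 2r, symmetric about zero by construction.
    u_aug = u_plus + [-u for u in u_plus]
    # Resample N residuals with replacement and rebuild the censored y*.
    u_star = [rng.choice(u_aug) for _ in xs]
    return [max(z + u, 0.0) for z, u in zip(index, u_star)]


rng = random.Random(0)
beta = [-3.0, 0.5, 0.2]          # stands in for the tobit MLE (assumption)
n = 40
xs = [[1.0, 2.4 + 5.2 * i / (n - 1), rng.uniform(0, 5)] for i in range(n)]
ys = [max(sum(b * v for b, v in zip(beta, x)) + rng.gauss(0, 1), 0.0) for x in xs]
y_star = augmented_bootstrap_y(beta, xs, ys, rng)
```

Note that each $u^*$ is attached to an $x_i$ regardless of whether the original observation i was censored, which is why the censoring pattern of the data is not retained.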

The first method, which is a nonparametric bootstrap method, applies the
balanced resampling technique to Efron's nonparametric bootstrap method to
reduce the bias in Efron's bootstrap estimates. To bootstrap B times, we copy the
original sample $\{(y_1,x_1), (y_2,x_2), \ldots, (y_N,x_N)\}$ B times to make a group with B×N
pairs of (y,x), then randomly draw without replacement to form B bootstrap
samples of size N. For each bootstrap sample, we can use the tobit MLE to get the
balanced bootstrap estimates.
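
A minimal Python sketch of the balanced resampling step follows. The function name `balanced_bootstrap_indices` is hypothetical, and the sketch works with observation indices rather than the (y,x) pairs themselves; a shuffle of the pooled copies is used as one way to draw without replacement.

```python
import random


def balanced_bootstrap_indices(n, n_boot, rng):
    """Return n_boot index lists of length n in which every index
    appears exactly n_boot times overall."""
    pool = list(range(n)) * n_boot     # B copies of the original sample
    rng.shuffle(pool)                  # random draw without replacement
    return [pool[b * n:(b + 1) * n] for b in range(n_boot)]


rng = random.Random(0)
samples = balanced_bootstrap_indices(n=40, n_boot=100, rng=rng)

# Count how often each observation is used across all bootstrap samples.
counts = [0] * 40
for sample in samples:
    for i in sample:
        counts[i] += 1
```

Within a single bootstrap sample an observation can still appear several times or not at all; the balance holds only across the full set of B samples.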

The second method, which is semiparametric, mixes the augmented
bootstrap with Efron's nonparametric bootstrap method. For residuals $\{\hat u_i^+\}$ we
have noncensored positive $\{y_i\}$, thus we keep the corresponding pairs $(x_i, \hat u_i^+)$. But
for residuals $\{-\hat u_i^+\}$, we do not know whether the corresponding observation is censored
or not. We may choose the same kinds of $x_i$ to pair with them by randomly
drawing $x_k$ with replacement from the entire set of x. Then we get the new pairs
$(x_k, -\hat u_i^+)$. Finally we can form the augmented sample as

$$(x^A, u^A) = \{(x_i, \hat u_i^+), (x_k, -\hat u_i^+)\} \tag{4.2}$$

Then $(x^A, u^A)$ is resampled with replacement to create a bootstrap sample $(x^*, u^*)$ of
size N. The remaining steps follow from the augmented bootstrap method.
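
The construction of the augmented sample (4.2) might be sketched in Python as follows. The function names and the four-observation data set are hypothetical, and the residual values are made up for illustration rather than computed from a fitted tobit model.

```python
import random


def mixed_augmented_sample(xs, ys, residuals, rng):
    """Build the augmented set of (x, u) pairs in (4.2)."""
    # Keep (x_i, u_i) for uncensored observations, where y_i > 0.
    kept = [(x, u) for x, y, u in zip(xs, ys, residuals) if y > 0]
    # Pair each mirrored residual -u_i with an x drawn from the whole sample.
    mirrored = [(rng.choice(xs), -u) for _, u in kept]
    return kept + mirrored


def resample_pairs(pairs, n, rng):
    """Draw n (x, u) pairs with replacement."""
    return [rng.choice(pairs) for _ in range(n)]


rng = random.Random(0)
xs = [[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.5]]
ys = [0.0, 0.7, 0.0, 2.1]
residuals = [0.0, -0.3, 0.0, 0.4]   # hypothetical; only entries with y > 0 are used
augmented = mixed_augmented_sample(xs, ys, residuals, rng)
bootstrap_pairs = resample_pairs(augmented, len(xs), rng)
```

Each resampled pair $(x^*, u^*)$ would then be turned into $y^* = \max(\hat\beta' x^* + u^*, 0)$, as in the augmented bootstrap method.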
So with Efron's bootstrap or balanced bootstrap methods, we avoid forcing
the  errors  to  be  symmetric  and  we  maintain  the  censoring  property.  With  the 
mixed  augmented  bootstrap  method,  we  force  the  errors  to  be  symmetric  and 
partially  maintain  the  censoring  property. 

To estimate the model by the tobit MLE, we use Fair's iteration method (Fair
1977). Let

$$\Phi_i = \int_{-\infty}^{\beta' x_i / \sigma} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt \tag{4.3}$$

$$\phi_i = \frac{1}{\sqrt{2\pi}}\, e^{-(\beta' x_i)^2 / 2\sigma^2} \tag{4.4}$$

For model (4.1), we have the log-likelihood function

$$\log L = \sum_0 \log(1 - \Phi_i) + \sum_1 \log\left( \frac{1}{\sqrt{2\pi\sigma^2}} \right) - \sum_1 \frac{1}{2\sigma^2} (y_i - \beta' x_i)^2 \tag{4.5}$$

where the summation $\sum_0$ is over the $N_0$ observations for which $y_i = 0$, and the
summation $\sum_1$ is over the $N_1$ observations for which $y_i > 0$. From the first-order
condition for a maximum, we have (see Maddala 1983, pp. 152-153)

$$\sigma^2 = \frac{1}{N_1} \sum_1 (y_i - \beta' x_i)\, y_i \tag{4.6}$$

$$\beta = \beta_{LS} - \sigma (X_1' X_1)^{-1} X_0' \gamma_0 \tag{4.7}$$

where $\beta_{LS}$ is the least squares estimator for $\beta$ obtained from the $N_1$ nonzero
observations on y. $X_1'$ is a $3 \times N_1$ matrix of values of $x_i$ for nonzero $y_i$. $X_0'$ is a $3 \times N_0$
matrix of values of $x_i$ for $y_i = 0$. $\gamma_0' = (\gamma_1, \ldots, \gamma_{N_0})$ is a $1 \times N_0$ vector of values of $\gamma_i$ for
$y_i = 0$, where

$$\gamma_i = \frac{\phi_i}{1 - \Phi_i} \tag{4.8}$$

Then Fair's iteration method for obtaining the maximum likelihood estimates of
$\beta$ and $\sigma^2$ from equations (4.6) and (4.7) can be processed with $\lambda = 0.4$ (see Maddala
1983, p. 154).
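
The log-likelihood (4.5) can be evaluated directly, as in the following Python sketch. The function names and the two-observation example are hypothetical; Fair's iteration itself, based on (4.6) and (4.7), is not implemented here.

```python
import math


def normal_cdf(z):
    """Standard normal c.d.f. via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_loglik(beta, sigma, xs, ys):
    """Log-likelihood (4.5) of the tobit model censored at zero."""
    total = 0.0
    for x, y in zip(xs, ys):
        xb = sum(b * v for b, v in zip(beta, x))
        if y > 0:
            # Uncensored term: log normal density of y with mean xb, s.d. sigma.
            total += -0.5 * math.log(2.0 * math.pi * sigma ** 2)
            total -= (y - xb) ** 2 / (2.0 * sigma ** 2)
        else:
            # Censored term: log(1 - Phi(xb / sigma)).
            total += math.log(1.0 - normal_cdf(xb / sigma))
    return total


xs = [[1.0, 0.0], [1.0, 2.0]]      # intercept and one regressor
ys = [0.0, 1.5]                    # first observation censored, second observed
loglik = tobit_loglik(beta=[0.0, 0.75], sigma=1.0, xs=xs, ys=ys)
```

In this example the censored observation has index zero and contributes log(0.5), while the uncensored one lies exactly on its mean and contributes only the normalizing constant.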

For the bootstrap estimation of the model, the mean of the squared errors
of $\beta$ ($\mathrm{MSE}_\beta$) can be partitioned into two terms:

$$\frac{1}{M} \sum_{m=1}^{M} (\beta^{*(m)} - \beta)^2 = \frac{1}{M} \sum_{m=1}^{M} (\beta^{*(m)} - \bar\beta^*)^2 + (\bar\beta^* - \beta)^2 \tag{4.9}$$

where $\beta^{*(m)}$ is the average bootstrap estimate of the true parameter $\beta$ in the m-th
replication, and $\bar\beta^*$ is the average of $\{\beta^{*(m)}\}$ over M replications. The first term on
the right-hand side of equation (4.9) is the residual sum of squares of $\beta$ ($\mathrm{RSS}_\beta$), and
the second term is the (squared) bias. For a given data set, the better method of estimation
should have a lower level of bias and/or a lower level of $\mathrm{MSE}_\beta$.
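
The partition in (4.9) can be checked numerically with a few lines of Python. The function name `mse_decomposition` and the three estimates are hypothetical and chosen only to illustrate the identity.

```python
def mse_decomposition(estimates, true_value):
    """Split the mean squared error into RSS and squared bias, as in (4.9)."""
    m = len(estimates)
    mean = sum(estimates) / m
    mse = sum((e - true_value) ** 2 for e in estimates) / m
    rss = sum((e - mean) ** 2 for e in estimates) / m   # spread about the mean
    bias_sq = (mean - true_value) ** 2                  # squared bias
    return mse, rss, bias_sq


mse, rss, bias_sq = mse_decomposition([0.3, 0.5, 0.7], true_value=0.4)
# mse equals rss + bias_sq up to floating-point rounding.
```

The identity holds for any set of estimates because the cross term vanishes once the deviations are taken about their own mean.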

4.3  Monte  Carlo  Experiments  for  Bootstrap  Methods 
We generate the data from the tobit model of equation (4.1) to compare the
different bootstrap methods by their estimates. Five bootstrap methods are
applied: Efron's nonparametric bootstrapping data method; Efron's nonparametric
bootstrap method modified by the balanced resampling technique; Flood's
augmented semiparametric bootstrap method; the mixed augmented
semiparametric bootstrap method; and the parametric bootstrapping residuals
method. We estimate three models.

The first model has two continuous exogenous variables and an intercept.
The true parameter values are $\beta_0 = -3$, $\beta_1 = 0.5$, and $\beta_2 = 0.2$. The continuous variable
$x_1$ takes on values between 2.4 and 7.6 in even increments, and $x_2$ is randomly
drawn from the uniform distribution over the range (0,5). The sample size is N=40.
The purpose of the first Monte Carlo experiment is to see how the proposed
bootstrap methods work.

The detailed procedure for estimating the first tobit model by the first four
bootstrap methods mentioned above is as follows:
Step 1:  Generate a random sample of data according to (4.1), with error
         terms drawn from the standard normal distribution.
Step 2:  Obtain estimates of the parameters of the tobit model using the tobit
         MLE.
Step 3a: Bootstrap the sample using Efron's nonparametric bootstrap method
         to get B=100 bootstrap samples, then estimate the model with the
         tobit MLE for each bootstrap sample. Finally, find the mean of the
         bootstrap estimates.
Step 3b: Bootstrap the sample with the balanced bootstrap method to get
         B=100 bootstrap samples. Estimate the model by the tobit MLE with
         each bootstrap sample, and then find the mean of the bootstrap
         estimates.
Step 3c: Bootstrap the sample with the augmented bootstrap method to get
         B=100 bootstrap samples. Next estimate the model using the tobit
         MLE for each bootstrap sample, and then find the mean of the
         bootstrap estimates.
Step 3d: Bootstrap the sample with the mixed-augmented bootstrap method
         to get B=100 bootstrap samples. Next estimate the model using the
         tobit MLE for each bootstrap sample, and then find the mean of the
         bootstrap estimates.
Step 4:  Repeat step 1 through step 3 M=100 times (this is the super loop) to
         obtain the averages of the estimates and the residual sum of squares
         ($\mathrm{RSS}_\beta$) for each bootstrap method and the tobit maximum likelihood
         estimates.

Comparing those four bootstrap methods in table 4.1, the balanced
resampling method is almost equivalent to Efron's nonparametric bootstrap
method, but with a significant increase in computer time. The mixed-augmented
bootstrap method greatly reduces the RSS_β values, but at the same time it
enlarges the biases significantly, which causes the method to be inefficient.
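The entries of table 4.1 appear to satisfy MSE = BIAS + RSS, with the BIAS column holding the squared difference between the Monte Carlo mean and the true value. That reading is inferred from the reported numbers rather than stated in the text, so the following Python check is only a sketch, and the helper name is hypothetical.

```python
def squared_bias_and_mse(true_value, mc_mean, rss):
    """Return (squared bias, squared bias + RSS), each rounded to three decimals."""
    bias_sq = (mc_mean - true_value) ** 2
    return round(bias_sq, 3), round(bias_sq + rss, 3)

# Intercept rows of table 4.1: (true value, Monte Carlo mean, RSS).
tobit_mle = squared_bias_and_mse(-3.000, -3.132, 0.724)
mixed_augmented = squared_bias_and_mse(-3.000, -2.201, 0.470)
print(tobit_mle)        # table 4.1 reports BIAS 0.017 and MSE 0.741
print(mixed_augmented)  # table 4.1 reports BIAS 0.638 and MSE 1.108
```

The mixed-augmented row shows the trade-off described above: the RSS falls from 0.724 to 0.470 while the squared bias rises from 0.017 to 0.638.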

For the other three bootstrap methods, we repeat the same Monte Carlo
study with 500 replications. Those results are presented in table 4.2 and table 4.4.
The third model we choose has two continuous exogenous variables generated
from the uniform distribution over the range (-2,2), and an intercept. The true
parameter values are β0=0.4, β1=1.0, and β2=1.0. The sample size is still 40. We
use these Monte Carlo experiments to see the differences between bootstrapping
data and bootstrapping residuals.

Comparing the differences between bootstrapping data and bootstrapping
residuals, we can see from table 4.2 and table 4.3 that the results from Efron's
nonparametric bootstrapping data method are very close to the results from the
parametric bootstrapping residuals method, except for the large bias for E(Y) from
the first method. But when we look at table 4.4 for the tobit II and tobit III models,
the SSRY values (sums of squared residuals of Y) are much higher for Efron's
nonparametric bootstrap method than for the parametric bootstrap method. This
suggests that the parametric bootstrap method is reliable and has less variance
than the method of directly bootstrapping data in the estimation of the tobit
model. Here, we should mention that we are dealing with a parametric
estimation method, i.e. the maximum likelihood estimation method.

Comparing Flood's augmented semiparametric bootstrap method with the
parametric bootstrap method in models II and III (table 4.2 and table 4.3), we can
see that the results from these two methods are nearly equivalent. Looking
at table 4.4, the SSRY values are a little smaller for the parametric bootstrap method,
while the SSGRY values are equivalent. Comparing the maximum likelihood estimation
method with the augmented bootstrap method or the parametric bootstrap
method, we can see from tables 4.2 to 4.4 that the maximum likelihood estimation
method has smaller MSE_β values, but larger SSRY values. So the augmented bootstrap
method or the parametric bootstrap method should provide reliable estimates.

From  table  4.4,  we  can  see  that  the  sum  of  squared  generalized  residuals 
(SSGRY)  and  the  sum  of  squared  residuals  of  Y  (SSRY)  seem  to  have  the  same 
power  as  criteria.  Even  though  they  have  different  values,  they  have  a  similar 
pattern  of  variation  for  both  models.  This  implies  that  the  generalized  residuals 
represent  the  residuals  well  in  the  tobit  model. 

4.4  Summary 
In  a  correctly  specified  tobit  model,  with  an  efficient  parametric  estimation 
method,  the  augmented  semiparametric  bootstrap  method  and  the  parametric 
bootstrap  estimation  method  will  give  better  results  than  the  nonparametric 
bootstrapping  data  method  and  the  tobit  maximum  likelihood  estimation  method. 
The  parametric  bootstrap  method,  which  is  widely  applicable,  is  almost  equivalent 
to  Flood's  augmented  bootstrap  method  in  the  tobit  model. 

Table 4.1: Correctly specified tobit model (I) as B=100, N=40 and M=100.

BETA       TRUE     MEAN     BIAS    RSS_β    MSE_β

TOBIT MAXIMUM LIKELIHOOD ESTIMATION
β0       -3.000   -3.132    0.017    0.724    0.741
β1        0.500    0.506    0.000    0.021    0.021
β2        0.200    0.224    0.001    0.017    0.018
E(Y)     -0.033   -0.075    0.002       --       --

EFRON'S NONPARAMETRIC BOOTSTRAPPING OF DATA
β0       -3.000   -3.211    0.045    0.788    0.833
β1        0.500    0.518    0.000    0.022    0.022
β2        0.200    0.229    0.001    0.017    0.018
E(Y)     -0.033   -0.082    0.002       --       --

BALANCED NONPARAMETRIC BOOTSTRAPPING OF DATA
β0       -3.000   -3.202    0.041    0.787    0.828
β1        0.500    0.518    0.000    0.022    0.022
β2        0.200    0.229    0.001    0.017    0.018
E(Y)     -0.033   -0.073    0.002       --       --

FLOOD'S SEMIPARAMETRIC BOOTSTRAPPING OF RESIDUALS
β0       -3.000   -3.109    0.012    0.723    0.735
β1        0.500    0.505    0.000    0.020    0.020
β2        0.200    0.224    0.001    0.016    0.017
E(Y)     -0.033   -0.057    0.001       --       --

MIXED-AUGMENTED SEMIPARAMETRIC BOOTSTRAP
β0       -3.000   -2.201    0.638    0.470    1.108
β1        0.500    0.391    0.012    0.014    0.026
β2        0.200    0.180    0.000    0.012    0.012
E(Y)     -0.033    0.179    0.045       --       --

Table 4.2: Correctly specified tobit model (II) as B=100, N=40 and M=500.

BETA       TRUE     MEAN     BIAS    RSS_β    MSE_β

TOBIT MAXIMUM LIKELIHOOD ESTIMATION
β0       -3.000   -3.077    0.006    0.808    0.814
β1        0.500    0.504    0.000    0.022    0.022
β2        0.200    0.213    0.000    0.017    0.017
E(Y)*     0.361    0.369    0.000       --       --

EFRON'S NONPARAMETRIC BOOTSTRAPPING OF DATA
β0       -3.000   -3.133    0.018    0.873    0.891
β1        0.500    0.511    0.000    0.023    0.023
β2        0.200    0.216    0.000    0.017    0.017
E(Y)*     0.361    0.257    0.011       --       --

PARAMETRIC BOOTSTRAPPING OF RESIDUALS
β0       -3.000   -3.141    0.020    0.878    0.898
β1        0.500    0.511    0.000    0.023    0.023
β2        0.200    0.217    0.000    0.017    0.018
E(Y)*     0.361    0.377    0.000       --       --

FLOOD'S SEMIPARAMETRIC BOOTSTRAPPING OF RESIDUALS
β0       -3.000   -3.091    0.008    0.895    0.903
β1        0.500    0.505    0.000    0.024    0.024
β2        0.200    0.214    0.000    0.017    0.017
E(Y)*     0.361    0.380    0.000       --       --

* the value for E(Y) is approximate.

Table 4.3: Correctly specified tobit model (III) as B=100, N=40 and M=500.

BETA       TRUE     MEAN     BIAS    RSS_β    MSE_β

TOBIT MAXIMUM LIKELIHOOD ESTIMATION
β0        0.400    0.392    0.000    0.045    0.045
β1        1.000    1.012    0.000    0.031    0.031
β2        1.000    1.000    0.000    0.033    0.033
E(Y)*     0.892    0.898    0.000       --       --

EFRON'S NONPARAMETRIC BOOTSTRAPPING OF DATA
β0        0.400    0.380    0.000    0.047    0.047
β1        1.000    1.019    0.000    0.031    0.031
β2        1.000    1.003    0.000    0.034    0.034
E(Y)*     0.892    0.564    0.108       --       --

PARAMETRIC BOOTSTRAPPING OF RESIDUALS
β0        0.400    0.380    0.000    0.047    0.047
β1        1.000    1.019    0.000    0.032    0.032
β2        1.000    1.004    0.000    0.034    0.034
E(Y)*     0.892    0.903    0.000       --       --

FLOOD'S SEMIPARAMETRIC BOOTSTRAPPING OF RESIDUALS
β0        0.400    0.387    0.000    0.047    0.047
β1        1.000    1.014    0.000    0.032    0.032
β2        1.000    0.999    0.000    0.033    0.033
E(Y)*     0.892    0.904    0.000       --       --

* the value for E(Y) is approximate.

Table 4.4: Comparison of criteria among different methods for tobit models II
and III.

METHOD            SSRY     SSGRY   BIAS OF E(Y)

TOBIT MODEL (II) ESTIMATION RESULTS (M=500)
TOBIT MLE       15.455     2.857          0.000
EFRON           32.795    11.188          0.011
PARAMETRIC      14.201     2.599          0.000
FLOOD           14.288     2.582          0.000

TOBIT MODEL (III) ESTIMATION RESULTS (M=500)
TOBIT MLE       18.669     1.782          0.000
EFRON          115.210    33.665          0.108
PARAMETRIC      16.917     1.578          0.000
FLOOD           17.227     1.565          0.000


CHAPTER  5 


TESTS  OF  HYPOTHESES  IN  LIMITED 
DEPENDENT  VARIABLE  MODELS 


5.1    Introduction 

Three  general  principles  employed  for  hypothesis  testing  in  econometrics 
are  the  Wald  (W),  likelihood  ratio  (LR),  and  Lagrange  multiplier  (LM)  criteria.  The 
W  test  was  introduced  by  Wald  (1943).  Aitchison  and  Silvey  (1958),  and  Silvey 
(1959)  first  developed  the  LM  test.  The  LM  test  is  also  the  same  as  the  score  test, 
Rao  (1947).  Although  those  hypothesis  tests  consider  the  general  issue  of 
hypothesis  testing  from  different  perspectives  and  have  different  critical  regions 
for  small  samples,  asymptotically  the  three  procedures  perform  identically. 

For testing linear restrictions on the coefficients of certain linear models,
Savin (1976), Berndt and Savin (1977), and Breusch (1979) showed that there
exists a systematic numerical inequality¹ between the test statistics. Specifically,
W ≥ LR ≥ LM. Because of this inequality, the tests may conflict in practice,
i.e. sometimes one test rejects a null hypothesis while another test fails to
reject it. Two problems arise from using the asymptotic chi-square distribution
as an approximation. Evans and Savin (1982) reported
that the probability of conflict can be substantial when the three tests are based
on the asymptotic chi-square critical value. They also concluded that in the
classical linear regression model the conflict between the W, LR, and LM tests is
due to the tests not having the correct significance level. This is another major
problem of these three tests. Note that there is no conflict between the three tests
when they are based on exact distributions (Evans and Savin, 1982).

¹ The inequality relation only holds for a general linear model with normal
disturbances provided that the unknown elements of the covariance matrix can be
estimated by maximum likelihood (ML) and the ML estimates of the coefficient parameters
are asymptotically uncorrelated with those of the covariance matrix parameters.
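
The conflict between tests described above can be made concrete with a short Python sketch. It counts a replication as a conflict whenever the three tests do not all reach the same decision against a common critical value; the statistic triples below are hypothetical values chosen to respect W ≥ LR ≥ LM, and the function name is illustrative.

```python
def conflict_probability(triples, critical_value):
    """Share of replications in which the W, LR and LM decisions are not unanimous."""
    conflicts = 0
    for w, lr, lm in triples:
        decisions = {w > critical_value, lr > critical_value, lm > critical_value}
        if len(decisions) > 1:  # at least one test rejects and at least one does not
            conflicts += 1
    return conflicts / len(triples)

# Hypothetical (W, LR, LM) values from four replications.
replications = [(5.0, 4.0, 3.0), (2.0, 1.5, 1.0), (4.5, 4.0, 3.9), (4.0, 3.5, 3.0)]
share = conflict_probability(replications, 3.84)
print(share)
```

With a single critical value and the ordering W ≥ LR ≥ LM, a conflict can only arise when the critical value falls between LM and W.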

Breusch and Pagan (1979), Godfrey (1978), and Griffiths and Surekha
(1986) found in their Monte Carlo experiments that the LM test rejects the null
hypotheses less frequently than indicated by its nominal size. In other words, the
nominal size of the test tends to overestimate the true probability of type I error
in finite samples.

There  are  two  kinds  of  correction  methods  that  can  be  used  to  solve  the 
significance  level  problem  of  these  three  tests  in  general  linear  regression  models. 
One  is  to  adjust  the  critical  value  of  the  tests.  Harris  (1985)  proposed  a  general 
size-corrected  LM  test  procedure  with  a  rigorous  theoretical  grounding.  With 
tedious  algebra,  Honda  (1988)  applied  Harris'  method  to  provide  the  formula  for 
the  size  correction  to  the  LM  test  for  heteroskedasticity.  The  second  kind  of 
correction  method  is  to  modify  the  test  statistic.  Evans  and  Savin  (1982)  compared 
two  correction  methods,  one  from  Gallant  (1975)  and  the  other  from  Rothenberg 
(1977), and concluded that the three Edgeworth size-corrected tests have nearly
correct significance levels and that the probability of conflict between the
size-corrected tests is of no consequence under commonly satisfied conditions.

For  nonlinear  regression  models,  the  inequality  relation  between  values  of 
statistics  is  no  longer  available.  Thus,  it  will  be  interesting  to  see  if  there  is  any 
conflict  between  these  tests.  Hauck  and  Donner  (1977)  argued  that  there  can  be 
substantial  conflict  in  the  results  of  the  Wald  and  LR  tests  for  the  logit  model. 
Recent studies show that, for the logit model, we do not have to rely on the
asymptotic distribution of a test statistic, because we can get exact inference
conditional on the sufficient statistic. For details, see the survey of exact inference
for contingency tables by Agresti (1992).

There  has  been  a  lot  of  effort  devoted  to  solving  the  significance  level 
problem  in  nonlinear  models.  Gallant  (1975)  has  suggested  using  degrees  of 
freedom  corrections  in  nonlinear  models,  and  Rothenberg  (1977)  has  suggested 
using  Edgeworth  size-corrections  in  the  multivariate  regression  model.  Rocke 
(1989)  applied  the  bootstrap  Bartlett  adjustment  to  the  likelihood  ratio  test 
statistic for the seemingly unrelated regression model. Rayner (1990), using
Edgeworth expansions, showed that a bootstrap Bartlett adjustment to the LR test
statistic may be used to estimate p values with error of order improved to n^(-3/2), but
that for the W and LM tests there is no such improvement.

Davidson and MacKinnon (1984) proposed several LM tests and an LR test
for the logit and probit models. They found that one of the LM tests outperforms the
other tests by having more accurate type I error with respect to the chi-square
distribution, but none of the tests has clearly larger power. Taylor (1991)
compared two kinds of LM tests for the tobit model. Instead of the asymptotic
chi-square critical values, he used empirical finite sample critical values from a
simulated exact distribution by generating ten thousand replications for each
sample size. He concluded that the Hessian LM test would be more powerful than
the outer-product of the gradient variant of the LM test. His choice of LM test
coincides with that of Davidson and MacKinnon.

Horowitz  (1991)  applied  a  bootstrap  method  to  a  set  of  Monte  Carlo 
experiments  and  showed  that  the  use  of  bootstrap-based  critical  values  eliminates 
the  same  kind  of  significance  level  problem  for  White's  information  matrix  test. 

As we have seen, there are two major problems with these three tests
that arise from using the asymptotic chi-square distribution as an approximation: the
first is the significance level problem, and the second is the conflict among these
three tests. The purpose of this chapter is to apply bootstrap methods to
approximate the exact distribution of these three test statistics for the logit, probit,
and tobit models, and to investigate the effects of the differences between
bootstrapping data and bootstrapping residuals on these hypothesis tests. We use
the Hessian W, Hessian LM, and LR tests, as well as the bootstrap Bartlett
adjusted LR test.

5.2  Wald,  Likelihood  Ratio  and  Lagrange  Multiplier  tests 
The  Wald  approach  starts  at  the  alternative  and  asks  whether  movement 
toward  the  null  would  be  an  improvement.  This  involves  estimation  under  the 
alternative  and  the  value  hypothesized  under  the  null,  where  the  metric  is  the 
expected  value  of  the  Hessian  matrix  evaluated  under  the  alternative.  In  contrast, 
the  Lagrange  multiplier  approach  starts  at  the  null  and  considers  movement 
toward  the  alternative.  This  requires  evaluating  the  slope  of  the  log-likelihood 
function  (the  score)  when  the  parameters  are  constrained  to  the  space  of  the  null, 
where  the  metric  is  the  inverse  of  the  expected  value  of  the  Hessian  evaluated 
under  the  null.  Finally,  the  likelihood  ratio  approach  compares  the  two  hypotheses 
directly  on  an  equal  basis.  This  involves  estimating  the  model  under  both  the  null 
and  the  alternative  and  then  comparing  the  difference  in  the  log-likelihood 
functions.  Which  one  to  use  usually  depends  on  such  factors  as  small  sample 
behavior  or  computational  convenience. 

Let $\hat\beta$ be the vector of unrestricted maximum likelihood (ML) estimates
of the parameters, $\tilde\beta$ be the vector of restricted ML estimates
of the parameters, $s(\beta)$ be the gradient of the log-likelihood function, and $I(\beta)$ be the
information matrix of a model. Then for the null hypothesis $H_0: \beta = \beta_0$, the Wald,
likelihood ratio, and Lagrange multiplier test statistics are

$$ W = (\hat\beta - \beta_0)' \, I(\hat\beta) \, (\hat\beta - \beta_0) \tag{5.1} $$

$$ LR = 2\left(\ln L(\hat\beta) - \ln L(\tilde\beta)\right) \tag{5.2} $$

$$ LM = s(\tilde\beta)' \, [I(\tilde\beta)]^{-1} \, s(\tilde\beta). \tag{5.3} $$

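For the logit model

$$ y_i = \begin{cases} 1 & \text{if } \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + u_i > 0 \\ 0 & \text{otherwise,} \end{cases} \tag{5.4} $$

let $x_i = (1 \;\; x_{1i} \;\; x_{2i})'$ be a column vector and $\beta$ be a 3x1 column vector. The
log-likelihood function for the model is

$$ \ln L = \sum_{i=1}^{N} \beta' x_i \, y_i - \sum_{i=1}^{N} \ln\left(1 + \exp(\beta' x_i)\right). \tag{5.5} $$
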


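We consider the null hypothesis $H_0: \beta_2 = \beta_{20}$, leaving $\beta_0$ and $\beta_1$ to be nuisance
parameters. Hence, the gradient of the log-likelihood function for $\beta_2$, the third
element of the gradient vector, is

$$ s_3(\beta) = \sum_{i=1}^{N} x_{2i} \, y_i - \sum_{i=1}^{N} x_{2i} \, \frac{\exp(\beta' x_i)}{1 + \exp(\beta' x_i)} \tag{5.6} $$

and from the second derivative of the log-likelihood function we derive the
information matrix

$$ I(\beta) = \sum_{i=1}^{N} \frac{\exp(\beta' x_i)}{\left[1 + \exp(\beta' x_i)\right]^2} \, x_i x_i'. \tag{5.7} $$
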


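Then the three test statistics for the null hypothesis $H_0$ are as follows, where
the subscript 33 denotes the third diagonal element, the one corresponding to $\beta_2$:

$$ W = \frac{(\hat\beta_2 - \beta_{20})^2}{\left[I^{-1}(\hat\beta)\right]_{33}} \tag{5.8} $$

$$ LR = 2\left(\ln L(\hat\beta) - \ln L(\tilde\beta)\right) \tag{5.9} $$

$$ LM = s_3(\tilde\beta) \left[I^{-1}(\tilde\beta)\right]_{33} s_3(\tilde\beta). \tag{5.10} $$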

For  the  probit  and  tobit  models,  there  will  be  similar  equations  from  (5.4) 
to  (5.10).  See  Maddala  (1983)  for  more  details. 
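
As an illustration of equations (5.5) to (5.10), the following Python sketch computes the three statistics for a simulated logit sample using only the standard library. It is not the dissertation's code: the Newton-Raphson fitting routine, the helper names and the fixed random seed are assumptions made for the example, and y is drawn as a Bernoulli outcome with the logistic probability, which is equivalent to thresholding a latent index with a logistic error as in (5.4).

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [u - f * v for u, v in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def linear_index(beta, x):
    return sum(b * v for b, v in zip(beta, x))

def log_likelihood(beta, xs, ys):
    """Logit log-likelihood, equation (5.5)."""
    return sum(y * linear_index(beta, x) - math.log1p(math.exp(linear_index(beta, x)))
               for x, y in zip(xs, ys))

def score_and_information(beta, xs, ys):
    """Gradient vector and information matrix of the logit log-likelihood."""
    k = len(beta)
    score = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-linear_index(beta, x)))
        for a in range(k):
            score[a] += x[a] * (y - p)
            for c in range(k):
                info[a][c] += p * (1.0 - p) * x[a] * x[c]
    return score, info

def fit_logit(xs, ys, fixed=None, iterations=25):
    """Newton-Raphson ML fit; fixed=(j, value) holds coefficient j at its null value."""
    k = len(xs[0])
    beta = [0.0] * k
    free = list(range(k))
    if fixed is not None:
        beta[fixed[0]] = fixed[1]
        free.remove(fixed[0])
    for _ in range(iterations):
        score, info = score_and_information(beta, xs, ys)
        step = solve([[info[a][c] for c in free] for a in free], [score[a] for a in free])
        for a, d in zip(free, step):
            beta[a] += d
    return beta

def logit_tests(xs, ys, j, null_value):
    """Return (W, LR, LM) for H0: beta_j = null_value, following (5.8) to (5.10)."""
    unrestricted = fit_logit(xs, ys)
    restricted = fit_logit(xs, ys, fixed=(j, null_value))
    unit = [1.0 if a == j else 0.0 for a in range(len(unrestricted))]
    _, info_u = score_and_information(unrestricted, xs, ys)
    score_r, info_r = score_and_information(restricted, xs, ys)
    wald = (unrestricted[j] - null_value) ** 2 / solve(info_u, unit)[j]
    lr = 2.0 * (log_likelihood(unrestricted, xs, ys) - log_likelihood(restricted, xs, ys))
    lm = score_r[j] ** 2 * solve(info_r, unit)[j]
    return wald, lr, lm

def simulate_logit(n, beta, rng):
    xs = [[1.0, rng.uniform(-2, 2), rng.uniform(-2, 2)] for _ in range(n)]
    ys = [1 if rng.random() < 1.0 / (1.0 + math.exp(-linear_index(beta, x))) else 0
          for x in xs]
    return xs, ys

rng = random.Random(0)
xs, ys = simulate_logit(100, [0.4, 1.0, 1.0], rng)
wald, lr, lm = logit_tests(xs, ys, 2, 1.0)
print(wald, lr, lm)
```

Here `solve(info, unit)[j]` extracts the j-th diagonal element of the inverse information matrix without forming the full inverse.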

Three  bootstrap  methods  will  be  used:  the  first  is  Efron's  nonparametric 
bootstrapping  data  method;  the  second  is  the  parametric  bootstrapping  residuals 
method,  and  the  third  is  Flood's  augmented  bootstrap  method.  Details  of  the  first 
two  bootstrap  methods  have  been  presented  in  chapter  3;  details  for  the  third  were 
discussed in chapter 4. Using equations (5.8) to (5.10), we can compute the
bootstrap test statistics W*, LR*, and LM*.

5.3  Monte  Carlo  Experiments  for  Hypothesis  Testing 
We generate the data from the logit model of equation (5.4). Our goal is
to improve the accuracy of the significance levels of the Wald, likelihood ratio, and
Lagrange multiplier tests by finding bootstrap critical regions instead of asymptotic
chi-square critical regions. Three different bootstrap methods are applied: Efron's
nonparametric bootstrapping data method, the parametric bootstrap method, and
Flood's augmented semiparametric bootstrap method. The model has two
continuous exogenous variables and an intercept. The true parameter values are
β0=0.4, β1=1.0, β2=1.0. Both continuous variables x1 and x2 are randomly drawn
from the uniform distribution over the range (-2,2). Approximately 40 percent of the
observations of Y are censored in each experiment. Sample sizes of N=50 and
N=100 are used. The null hypothesis is H0: β2=1.0 and the alternative is H1: β2≠1.0.
Using this same setting, we do Monte Carlo experiments for the probit and tobit
models, with the error term of the tobit model generated from the standard normal
distribution.
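
The design above can be sketched in Python for the tobit case. The sketch assumes the standard tobit form with censoring at zero from below (equation (4.1) lies outside this section, so this is an assumption), and the function name and seed are illustrative. It also checks the share of censored observations implied by the stated parameter values.

```python
import random

def simulate_tobit(n, beta, rng):
    """Draw (x, y) pairs with y = max(0, beta'x + e), e standard normal (assumed form)."""
    data = []
    for _ in range(n):
        x = [1.0, rng.uniform(-2, 2), rng.uniform(-2, 2)]
        latent = sum(b * v for b, v in zip(beta, x)) + rng.gauss(0.0, 1.0)
        data.append((x, max(0.0, latent)))
    return data

rng = random.Random(42)
# A large sample is used here only to estimate the censoring share precisely;
# the experiments in the text use N=50 and N=100.
sample = simulate_tobit(20000, [0.4, 1.0, 1.0], rng)
censored_share = sum(1 for _, y in sample if y == 0.0) / len(sample)
print(round(censored_share, 3))
```

Under these assumptions the censored share should come out close to the roughly 40 percent quoted in the text.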

The detailed procedure for the logit model is as follows:
Step 1:  Generate a random sample of data (Y,X) of size N according to
         equation (5.4) with errors generated from the logistic distribution.
Step 2:  Obtain restricted ML estimates² and unrestricted ML estimates of
         the model from sample (Y,X). Then compute the W, LR, and LM test
         statistics according to equations (5.8) to (5.10). Call their values
         W0, LR0, and LM0.
Step 3a: Bootstrap the sample (Y,X) using Efron's nonparametric bootstrap
         method to get a new bootstrap sample (Y*,X*). Then compute the
         restricted and unrestricted ML estimates.
Step 3b: Generate errors from the normal distribution with the restricted ML
         estimated variance. Then, by using both the hypothesized parameter
         values of the null and the restricted ML estimates from the original
         sample (Y,X), we can obtain Y* to get a new parametric bootstrap
         sample (Y*,X) with which to compute the restricted and unrestricted
         ML estimates.
Step 3c: For the augmented error method, we can obtain a new bootstrap
         sample with which to compute the restricted and unrestricted ML
         estimates. (Details of the procedure are presented in chapter 4.)
Step 4:  Compute the three test statistics for each bootstrap method. Call
         their values W*(b), LR*(b), and LM*(b).
Step 5:  For each bootstrap method, estimate the α-level critical values of
         these three tests from the empirical distributions of {W*(b)}, {LR*(b)},
         and {LM*(b)} that are obtained by repeating step 3 and step 4 B=100
         times. Let C_W(α), C_LR(α), and C_LM(α) denote the estimated critical
         values.
Step 6:  For each bootstrap method, reject the null hypothesis at the
         nominal α-level based on the bootstrap critical values if W0 > C_W(α) for
         the Wald test, LR0 > C_LR(α) for the LR test, and LM0 > C_LM(α) for the LM
         test. Reject the null hypothesis at the nominal α-level based on the
         asymptotic chi-square critical value if W0 > χ²(1-α) for the Wald test,
         LR0 > χ²(1-α) for the LR test, and LM0 > χ²(1-α) for the LM test, with
         one degree of freedom.

² For the tobit model, the same formulae as (4.6) and (4.7) have been used to
estimate approximately restricted ML estimates, by cutting down one dimension for β. The
estimates are not exactly restricted ML estimates.

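
Steps 5 and 6 can be sketched in Python as follows. The quantile convention (the ceil((1-α)B)-th order statistic) is an assumption, since the text does not say how the empirical critical value is read off; the function names and the bootstrap values are hypothetical. The chi-square critical value with one degree of freedom is obtained as the square of a standard normal quantile.

```python
import math
from statistics import NormalDist

def bootstrap_critical_value(boot_stats, alpha):
    """Empirical (1 - alpha) quantile of the bootstrap statistics (step 5)."""
    ordered = sorted(boot_stats)
    # round() guards against floating-point noise before taking the ceiling
    rank = math.ceil(round((1.0 - alpha) * len(ordered), 9))
    return ordered[min(max(rank, 1), len(ordered)) - 1]

def chi_square_1df_critical_value(alpha):
    """Upper-alpha point of the chi-square distribution with one degree of freedom."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2

def reject(stat0, critical_value):
    """Decision rule of step 6: reject when the original statistic exceeds the critical value."""
    return stat0 > critical_value

boot_lr = [b / 10 for b in range(1, 101)]  # hypothetical LR*(b) values, B=100
c_lr = bootstrap_critical_value(boot_lr, 0.10)
c_chi2 = chi_square_1df_critical_value(0.10)
lr0 = 3.0                                  # hypothetical LR0 from the original sample
print(c_lr, reject(lr0, c_lr))
print(round(c_chi2, 3), reject(lr0, c_chi2))
```

The same original statistic can lead to different decisions under the two critical values, which is the source of the differences in true significance levels studied in this chapter.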
In addition to the Wald, likelihood ratio, and Lagrange multiplier test
statistics, we also consider the bootstrap Bartlett adjusted LR test. This procedure
is as follows: first, we get $LR_0$ from the original sample as in step 2. Second, we
have the bootstrap LR test statistics $LR^{*(b)}$ from step 3, whose average $\overline{LR}^*$
over the 100 bootstrap samples estimates the true average value of the LR statistic under
the null hypothesis. Finally, the bootstrap Bartlett adjusted LR statistic is

$$ LR_B = \frac{LR_0}{\overline{LR}^*} \tag{5.11} $$

which is tested against a chi-square distribution with one degree of freedom.
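
A minimal Python sketch of the adjustment in (5.11), with hypothetical numbers. A chi-square variable with one degree of freedom has mean one, so dividing by the bootstrap mean rescales the statistic toward that reference distribution; the function name is illustrative.

```python
from statistics import fmean

def bartlett_adjusted_lr(lr0, boot_lr):
    """Divide the original LR statistic by the mean of the bootstrap LR statistics."""
    return lr0 / fmean(boot_lr)

# Hypothetical values: LR0 from the original sample and four bootstrap LR*(b) draws.
adjusted = bartlett_adjusted_lr(4.0, [1.0, 1.5, 2.5, 3.0])
print(adjusted)
```

In the experiments the mean would be taken over all B=100 bootstrap statistics rather than four.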

Tables 5.1 to 5.6 give the actual percentages of rejection, i.e. the
true significance levels, for seven different nominal levels. We also provide the
absolute values of the relative change of the true level away from the nominal level
in parentheses under the true significance levels for each entry. For instance, if
the nominal level is α=0.10 (or 10%), and the true significance level is 0.088 (or
8.8%), then the absolute value of the relative change of the true level away from
the nominal level is 0.12 (or 12% in the tables) because the absolute value of the
difference between 0.10 and 0.088 divided by 0.10 is 0.12. In table 5.7, we add all
the relative changes of the seven columns. We exclude the asymptotic Wald test
from this averaging because of its bad results. We also exclude the discussion of
Efron's nonparametric bootstrap method because of its extremely unsatisfactory
results for all the Monte Carlo experiments.
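
The accuracy measure just described can be written as a small Python helper. The first call reproduces the worked example from the text; the list of (true level, nominal level) pairs used for the sum is hypothetical and only illustrates how one row of table 5.7 would be accumulated.

```python
def relative_change(true_level, nominal_level):
    """Absolute relative change of the true level away from the nominal level."""
    return abs(true_level - nominal_level) / nominal_level

example = relative_change(0.088, 0.10)
print(round(example, 2))  # the text's worked example gives 0.12

# Hypothetical (true level, nominal level) pairs for one test.
pairs = [(0.012, 0.01), (0.055, 0.05), (0.088, 0.10)]
total = sum(relative_change(t, n) for t, n in pairs)
print(round(total, 2))
```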

The  purpose  of  these  Monte  Carlo  experiments  is  to  find  the  best  hypothesis 
test  with  the  correct  true  significance  levels  for  each  model;  to  see  if  applying 
bootstrap  methods  improves  hypothesis  testing  over  the  Wald,  likelihood  ratio,  and 
Lagrange  multiplier  tests;  to  investigate  the  problem  of  conflict  between  these 
three  tests  in  these  three  limited  dependent  variable  models;  and  finally,  to  see  the 
differences  between  the  bootstrapping  data  method  and  the  bootstrapping 
residuals  method  for  these  hypothesis  tests. 

For  the  logit  model  (table  5.7),  Flood's  augmented  bootstrap  method  does 
not  yield  accurate  true  significance  levels.  It  has  an  average  of  about  41.6% 
(66.2%)  for  the  relative  changes  of  the  true  levels  away  from  the  nominal  levels 
with  sample  size  N=50  (N=100).  This  method  consistently  underestimates  the 
nominal  levels  by  a  large  margin.  The  reason  for  this  might  be  that  the  augmented 
bootstrap  method  forces  the  error  terms  to  be  symmetric  in  each  bootstrap 
sample.  The  probability  of  conflict  between  the  tests  based  on  Flood's  augmented 
bootstrap  critical  values  is  small,  an  average  of  0.019  for  both  sample  sizes  (table 
5.10). 

All the tests with sample size N=50 generated from the parametric bootstrap
method consistently overestimate the nominal levels by an average of 23.6% (tables
5.2 and 5.7). All the tests with sample size N=100 generated from the parametric
bootstrap method have satisfactory performance with little overestimation. We can
see from table 5.10 that the probability of conflict between the tests based on
parametric bootstrap critical values is small, an average of 0.011 for both sample
sizes. The probability of conflict between these three tests based on asymptotic
chi-square critical values is moderate, an average of about 0.05 for both sample
sizes. Overall for the logit model, the LM test with the asymptotic chi-square
critical values gives the most accurate true significance levels. The Wald test
with the parametric bootstrap critical values, as well as the parametric bootstrap
Bartlett adjusted LR test with the chi-square critical values, gives satisfactory true
significance levels.

For the probit model (table 5.7), all four tests with the small sample N=50
generated from both the parametric bootstrap method and Flood's augmented
bootstrap method give similar results, with an average of 35% relative change. At
a sample size of 100, the tests generated from the parametric bootstrap method
give satisfactory results, especially for the Wald, LR, and bootstrap Bartlett
adjusted LR tests. For both sample sizes, however, the LM test with asymptotic
chi-square critical values gives the most accurate true significance levels, and the
LM test generated from the parametric bootstrap method gives satisfactory results.
The average probability of conflict (table 5.10) between these three tests generated
from the parametric bootstrap method is about 0.039 with a maximum of 0.068
for a sample size of 50 and only about 0.013 with a maximum of 0.024 for a sample
size of 100.

For  the  tobit  model,  the  tests  based  on  asymptotic  chi-square  critical  values 
substantially  over-reject  the  null  hypothesis  (tables  5.5  to  5.7).  The  true 
significance  levels  are  much  larger  than  the  nominal  levels  by  an  average  relative 
change of 73% (62%) for sample size N=50 (N=100). For sample sizes of both 50
and  100,  the  Wald,  LR,  and  LM  tests  generated  from  both  bootstrap  methods,  the 
augmented  semiparametric  bootstrap  method  and  the  parametric  bootstrap 
method,  give  satisfactory  true  significance  levels,  with  the  performance  of  the 
augmented  bootstrap  method  better  on  average  (tables  5.5  to  5.7).  The  bootstrap 
Bartlett  adjusted  LR  test  performs  well  in  the  large  sample.  For  the  tobit  model, 
the  Wald,  LR,  and  LM  tests  are  almost  equivalent  whichever  sample  size  we  choose 
and  whichever  bootstrap  method  we  apply.  The  maximum  probability  of  conflict 
between  these  three  tests  in  table  5.10  is  0.01  with  an  average  of  0.004  over  five 
hundred  replications.  Even  for  the  badly  behaved  chi-square  approximations,  the 
average  probability  of  conflict  is  only  0.021  out  of  five  hundred  replications  for 
both  sample  sizes.  Therefore,  for  testing  in  the  tobit  model,  these  Monte  Carlo 
experiments  suggest  use  of  the  Wald,  LR,  and  LM  tests  with  the  augmented 
bootstrap  based  critical  values. 

For the Wald, LR, and LM tests based on asymptotic chi-square critical
values, there are large probabilities of conflict (table 5.10). On average the
probability of conflict is about 0.05 for the logit model and 0.145 for the probit
model. From table 5.7, we can see that the Wald test with the chi-square critical
region substantially over-rejects the null hypotheses for all three models, with an
average relative change of 110%. All three tests perform poorly in the tobit model.
The LR test performs well in both the logit and probit models with a sample size of
100, where the average relative change is about 9.6%, but it over-rejects the null
hypothesis in both models with a sample size of 50, where the average
relative change is about 33.7%. The LM test performs excellently in both the logit
and probit models. This result coincides with the conclusion of Davidson and
MacKinnon (1984).

The bootstrap Bartlett adjusted LR test performs well only in large samples,
so it does not deliver the small-sample improvement that bootstrap methods are
expected to provide (table 5.7).

For the two bootstrap methods, we can see from table 5.7 that, overall, tests
using  the  parametric  bootstrap  based  critical  values  perform  better  than  tests 
using  augmented  bootstrap  based  critical  values.  The  latter  performs  satisfactorily 
only  in  the  tobit  model.  For  the  augmented  bootstrap  method,  it  is  strange  that  the 
tests  perform  better  in  small  samples  than  in  large  samples.  As  the  sample  size 
increases  to  100  in  both  the  logit  and  probit  models,  the  two  methods  give 
opposite  results:  the  parametric  bootstrap  method  generates  better  tests  with  a 
small  average  relative  change,  and  the  augmented  bootstrap  method  generates 
worse  tests  with  a  large  average  relative  change. 

From tables 5.1 through 5.6, we get table 5.9 by summing each percentage
column  of  absolute  values  of  relative  changes  of  the  true  levels  to  the  nominal 
levels  for  each  table.  We  can  see  that  the  1%  column  is  very  sensitive  and  difficult 
to  match,  while  the  50%  column  is  stable  and  the  easiest  to  match.  Since  there  are 
a  lot  of  significantly  large  relative  changes  in  an  absolute  sense  in  the  1%  column, 
we  can  create  a  new  table  5.8  by  removing  the  1%  column  from  table  5.7.  For  the 
parametric  bootstrap  method  in  table  5.8,  more  than  30%  of  the  total  sums  of 
relative  changes  have  been  reduced  for  each  of  these  three  tests  with  respect  to 
table  5.7.  The  same  results  hold  for  the  asymptotic  chi-square  distribution  in  table 
5.8.  This  means  that  it  is  really  difficult  for  these  two  methods  to  accurately 
estimate  the  0.01  nominal  level,  and  relatively  much  easier  to  estimate  the  other 
nominal  levels. 
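
The arithmetic behind the parenthesized entries and the row totals can be sketched as follows. This is a minimal illustration, assuming that each relative change is the absolute difference between the true and nominal levels, divided by the nominal level, expressed in percent and rounded to the nearest integer. The names `relative_changes`, `total_all` and `total_without_1pct` are hypothetical; the input row is the Wald row for the parametric bootstrap in table 5.1.

```python
# Nominal levels (in percent) and the Wald row of table 5.1
# (logit, N=50, parametric bootstrapping of residuals).
NOMINAL = [1, 5, 10, 15, 20, 25, 50]
WALD_TRUE = [1.6, 6.0, 10.4, 16.6, 22.8, 27.6, 54.8]


def relative_changes(true_levels, nominal_levels):
    """Absolute percentage difference of each true level from its nominal level.

    Rounding to the nearest integer is an assumption about how the tables
    were prepared.
    """
    return [round(abs(t - n) / n * 100) for t, n in zip(true_levels, nominal_levels)]


changes = relative_changes(WALD_TRUE, NOMINAL)

# Row total over all seven nominal levels (the kind of entry shown in table 5.7).
total_all = sum(changes)

# Row total with the 1% column dropped (the kind of entry shown in table 5.8).
total_without_1pct = sum(changes[1:])

print(changes, total_all, total_without_1pct)
```

For this row the 1% column alone contributes 60 of the 129 points, which illustrates why dropping that column changes the totals so much.
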

5.4 Summary

There are many hypothesis tests that can be used for econometric models,
but for each hypothesis there should be only one best test. On the basis of these
limited Monte Carlo experiments, we suggest that for the logit and probit models, the
LM test based on chi-square critical values provides accurate true significance
levels. For the tobit model, the Wald, LR, and LM tests using Flood's augmented
bootstrap based critical values are all equivalent and provide accurate true
significance levels. For all three models, the Wald test using chi-square based
critical values always substantially over-rejects the null hypothesis.
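
How a true significance level is estimated from Monte Carlo replications, under either kind of critical value, can be sketched as follows. This is only an illustration under stated assumptions: the test is taken to have one restriction, so the asymptotic reference distribution is chi-square with one degree of freedom; the bootstrap critical value is taken to be a simple empirical quantile; and the replicated statistics are artificial draws rather than statistics from the logit, probit or tobit experiments. All function names are hypothetical.

```python
import random
from statistics import NormalDist

NOMINAL = [0.01, 0.05, 0.10, 0.15, 0.20, 0.25, 0.50]


def chi2_1_critical(alpha):
    """Upper-alpha critical value of chi-square(1), via the squared normal quantile."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * z


def bootstrap_critical(boot_stats, alpha):
    """Empirical upper-alpha quantile of bootstrap statistics (assumed quantile rule)."""
    ordered = sorted(boot_stats)
    index = min(len(ordered) - 1, int((1 - alpha) * len(ordered)))
    return ordered[index]


def true_level(stats, critical):
    """Percentage of Monte Carlo replications whose statistic exceeds the critical value."""
    return 100 * sum(s > critical for s in stats) / len(stats)


# Artificial stand-in for M = 500 replications of a test statistic under the
# null hypothesis: squared standard normal draws, which are exactly chi-square(1).
rng = random.Random(0)
stats = [rng.gauss(0, 1) ** 2 for _ in range(500)]

# Estimated true level (in percent) at each nominal level.
levels = {alpha: true_level(stats, chi2_1_critical(alpha)) for alpha in NOMINAL}
print(levels)
```

A test over-rejects when the estimated level exceeds the nominal level. Replacing `chi2_1_critical(alpha)` with `bootstrap_critical(boot_stats, alpha)`, computed from bootstrap replications of the statistic, gives the bootstrap based counterpart.
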

Since  we  are  testing  parametric  models  with  parametric  estimation  methods 
and  parametric  hypothesis  tests,  not  surprisingly,  the  tests  generated  using 
Efron's  nonparametric  bootstrapping  of  data  give  us  useless  results  in  our  Monte 
Carlo  experiments. 

The  bootstrap  Bartlett  adjusted  likelihood  ratio  test  does  not  perform  as 
well  as  expected  in  small  samples. 

With  the  parametric  bootstrap  method  and  the  augmented  bootstrap 
method,  the  probabilities  of  conflict  between  the  Wald,  LR,  and  LM  tests  are  of  no 
consequence  for  the  logit  and  probit  models,  and  especially  for  the  tobit  model. 
However,  when  these  three  tests  use  chi-square  based  critical  values,  there  are 
some  substantial  conflicts  among  them  in  the  logit  and  probit  models. 
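
A probability of conflict can be estimated from the same Monte Carlo replications. The sketch below assumes that a conflict occurs in a replication whenever the Wald, LR, and LM tests do not all reach the same accept or reject decision at a given nominal level; this definition, the function name and the numbers in the example are assumptions made for illustration, not the experiments reported here.

```python
def conflict_probability(wald, lr, lm, crit_wald, crit_lr, crit_lm):
    """Percentage of replications in which the three tests disagree.

    Each of wald, lr, lm holds one statistic per replication; each test is
    compared with its own critical value (chi-square or bootstrap based).
    """
    conflicts = 0
    for w, l, m in zip(wald, lr, lm):
        decisions = {w > crit_wald, l > crit_lr, m > crit_lm}
        if len(decisions) > 1:  # both "reject" and "do not reject" occur
            conflicts += 1
    return 100 * conflicts / len(wald)


# Four made-up replications, all compared with the 5% chi-square(1) value 3.84.
wald_stats = [5.00, 4.00, 1.00, 3.90]
lr_stats = [4.50, 3.90, 0.90, 3.85]
lm_stats = [4.00, 3.50, 0.80, 3.60]

rate = conflict_probability(wald_stats, lr_stats, lm_stats, 3.84, 3.84, 3.84)
print(rate)
```

In this example the tests disagree in the second and fourth replications, so the estimated probability of conflict is 50%.
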
Table 5.1: True significance levels with absolute values of percentage
differences between the true levels and the nominal levels in
parentheses for the logit model of sample size N=50 and M=500.

NOMINAL   1%           5%           10%          15%          20%          25%          50%

PARAMETRIC BOOTSTRAPPING OF RESIDUALS
WALD      1.6 (60)     6.0 (20)     10.4 (4)     16.6 (11)    22.8 (14)    27.6 (10)    54.8 (10)
LR        2.2 (120)    6.2 (24)     12.0 (20)    17.6 (17)    22.2 (11)    26.6 (6)     55.0 (10)
LR-AD     1.0 (0)      5.6 (12)     11.0 (10)    15.2 (1)     21.4 (7)     26.6 (6)     54.2 (8)
LM        2.4 (140)    7.2 (44)     13.0 (30)    18.0 (20)    24.4 (22)    28.8 (15)    54.6 (9)

FLOOD'S AUGMENTED BOOTSTRAPPING OF RESIDUALS
WALD      0.2 (80)     2.4 (52)     5.4 (46)     7.8 (48)     11.2 (44)    13.2 (47)    36.2 (28)
LR        0.4 (60)     3.4 (32)     5.2 (48)     9.2 (39)     12.2 (39)    14.2 (43)    37.6 (25)
LR-AD     0.0 (100)    2.0 (60)     4.8 (52)     6.8 (55)     12.8 (36)    14.0 (44)    41.6 (17)
LM        1.0 (0)      3.8 (24)     7.8 (22)     11.0 (27)    13.6 (32)    16.6 (34)    35.2 (30)

ASYMPTOTIC CHI-SQUARE DISTRIBUTION
WALD      2.6 (160)    10.8 (116)   16.2 (62)    20.2 (35)    29.4 (47)    33.6 (34)    58.6 (17)
LR        1.6 (60)     7.8 (56)     12.6 (22)    17.6 (17)    25.0 (25)    28.8 (15)    56.2 (12)
LM        1.0 (0)      5.0 (0)      11.2 (12)    16.0 (7)     24.0 (20)    27.6 (10)    55.0 (10)

Table 5.2: True significance levels with absolute values of percentage
differences between the true levels and the nominal levels in
parentheses for the logit model of sample size N=100 and M=500.

NOMINAL   1%           5%           10%          15%          20%          25%          50%

PARAMETRIC BOOTSTRAPPING OF RESIDUALS
WALD      1.0 (0)      5.8 (16)     11.2 (12)    15.6 (4)     20.8 (4)     26.8 (7)     51.0 (2)
LR        1.2 (20)     5.8 (16)     11.4 (14)    15.8 (5)     20.6 (3)     27.2 (9)     51.6 (3)
LR-AD     0.2 (80)     4.6 (8)      10.8 (8)     14.2 (5)     20.4 (2)     26.4 (6)     51.0 (2)
LM        1.4 (40)     4.6 (8)      11.0 (10)    14.8 (1)     20.4 (2)     27.8 (11)    51.6 (3)

FLOOD'S AUGMENTED BOOTSTRAPPING OF RESIDUALS
WALD      0.0 (100)    0.6 (88)     2.2 (78)     4.8 (68)     7.8 (61)     9.2 (63)     23.8 (52)
LR        0.2 (80)     1.0 (80)     2.6 (74)     6.0 (60)     8.2 (59)     10.0 (60)    25.2 (50)
LR-AD     0.0 (100)    1.0 (80)     2.4 (76)     4.8 (68)     7.8 (61)     10.0 (60)    32.8 (34)
LM        0.4 (60)     1.8 (64)     3.4 (66)     7.0 (53)     9.6 (52)     10.6 (58)    26.2 (48)

ASYMPTOTIC CHI-SQUARE DISTRIBUTION
WALD      1.3 (30)     9.4 (88)     16.2 (62)    21.4 (43)    28.6 (43)    34.2 (37)    56.2 (12)
LR        0.4 (60)     4.8 (4)      11.4 (14)    15.2 (1)     21.4 (7)     27.0 (8)     51.8 (4)
LM        0.6 (40)     3.8 (24)     10.6 (6)     14.4 (4)     21.2 (6)     26.4 (6)     52.0 (4)

Table 5.3: True significance levels with absolute values of percentage
differences between the true levels and the nominal levels in
parentheses for the probit model of sample size N=50 and M=500.

NOMINAL   1%           5%           10%          15%          20%          25%          50%

PARAMETRIC BOOTSTRAPPING OF RESIDUALS
WALD      0.0 (100)    0.4 (92)     4.4 (56)     10.8 (28)    18.6 (7)     26.4 (6)     50.6 (1)
LR        0.0 (100)    1.4 (72)     5.2 (48)     13.2 (12)    22.0 (10)    28.0 (12)    52.4 (5)
LR-AD     0.0 (100)    0.8 (84)     2.8 (72)     9.6 (36)     19.4 (3)     26.8 (7)     52.2 (4)
LM        1.4 (40)     4.0 (20)     9.4 (6)      17.6 (17)    23.8 (19)    29.6 (18)    52.6 (5)

FLOOD'S AUGMENTED BOOTSTRAPPING OF RESIDUALS
WALD      0.0 (100)    1.2 (76)     7.8 (22)     17.0 (13)    24.6 (23)    30.2 (21)    47.4 (5)
LR        0.2 (80)     1.6 (68)     7.2 (28)     18.0 (20)    24.6 (23)    29.4 (18)    47.2 (6)
LR-AD     0.0 (100)    1.2 (76)     6.4 (36)     11.4 (24)    22.4 (12)    28.6 (14)    48.8 (2)
LM        0.4 (60)     2.0 (60)     7.8 (22)     17.8 (19)    23.4 (17)    27.0 (8)     47.2 (6)

ASYMPTOTIC CHI-SQUARE DISTRIBUTION
WALD      12.8 (1180)  22.8 (356)   29.8 (198)   34.8 (132)   41.2 (102)   43.6 (74)    63.6 (27)
LR        1.4 (40)     8.4 (68)     14.4 (44)    20.8 (39)    27.0 (35)    32.0 (28)    55.4 (11)
LM        0.4 (60)     4.4 (12)     7.8 (22)     13.4 (11)    20.8 (4)     26.8 (7)     54.2 (8)

Table 5.4: True significance levels with absolute values of percentage
differences between the true levels and the nominal levels in
parentheses for the probit model of sample size N=100 and M=500.

NOMINAL   1%           5%           10%          15%          20%          25%          50%

PARAMETRIC BOOTSTRAPPING OF RESIDUALS
WALD      1.4 (40)     4.4 (12)     8.6 (14)     14.0 (7)     18.6 (7)     24.6 (2)     49.8 (0)
LR        1.4 (40)     4.6 (8)      8.8 (12)     15.0 (0)     19.2 (4)     25.0 (0)     50.4 (1)
LR-AD     0.6 (40)     4.8 (4)      8.6 (14)     13.2 (12)    19.0 (5)     23.0 (8)     50.6 (1)
LM        2.2 (120)    6.0 (20)     11.0 (10)    15.4 (3)     20.0 (0)     25.0 (0)     49.6 (1)

FLOOD'S AUGMENTED BOOTSTRAPPING OF RESIDUALS
WALD      3.8 (280)    6.4 (28)     9.0 (10)     11.8 (21)    16.4 (18)    19.4 (22)    34.4 (31)
LR        2.4 (140)    6.2 (24)     7.2 (28)     11.0 (27)    14.0 (30)    17.8 (29)    33.6 (33)
LR-AD     1.8 (80)     5.0 (0)      9.0 (10)     10.4 (31)    14.4 (28)    18.0 (28)    37.0 (26)
LM        1.6 (60)     4.4 (12)     6.4 (36)     9.0 (40)     11.8 (41)    15.2 (39)    31.2 (38)

ASYMPTOTIC CHI-SQUARE DISTRIBUTION
WALD      5.0 (400)    14.4 (188)   22.0 (210)   26.2 (75)    32.8 (64)    36.8 (47)    62.8 (26)
LR        1.0 (0)      4.4 (12)     9.0 (10)     15.0 (0)     20.8 (4)     26.4 (6)     52.4 (5)
LM        0.8 (20)     2.8 (44)     8.2 (18)     11.8 (21)    20.0 (0)     23.4 (6)     51.2 (2)

Table 5.5: True significance levels with absolute values of percentage
differences between the true levels and the nominal levels in
parentheses for the tobit model of sample size N=50 and M=500.

NOMINAL   1%           5%           10%          15%          20%          25%          50%

PARAMETRIC BOOTSTRAPPING OF RESIDUALS
WALD      0.8 (20)     3.0 (40)     7.6 (24)     15.6 (4)     19.6 (2)     24.8 (1)     52.0 (4)
LR        0.8 (20)     3.2 (36)     8.0 (20)     15.6 (4)     19.4 (3)     25.0 (0)     51.8 (4)
LR-AD     0.8 (20)     2.6 (48)     6.6 (34)     11.6 (23)    16.2 (19)    20.4 (18)    48.4 (3)
LM        0.8 (20)     2.6 (48)     7.8 (22)     15.0 (0)     19.6 (2)     25.0 (0)     51.6 (3)

FLOOD'S AUGMENTED BOOTSTRAPPING OF RESIDUALS
WALD      1.0 (0)      4.4 (12)     9.2 (8)      15.0 (0)     19.6 (2)     24.6 (2)     50.6 (1)
LR        1.4 (40)     4.8 (4)      9.0 (10)     15.0 (0)     19.6 (2)     24.6 (2)     50.6 (1)
LR-AD     1.6 (60)     4.6 (8)      6.8 (32)     11.2 (25)    15.8 (21)    19.6 (22)    46.6 (7)
LM        1.0 (0)      4.4 (12)     8.8 (12)     14.4 (4)     19.0 (5)     24.4 (2)     50.8 (2)

ASYMPTOTIC CHI-SQUARE DISTRIBUTION
WALD      3.0 (200)    7.4 (48)     14.6 (46)    19.2 (28)    25.0 (25)    28.8 (15)    56.2 (13)
LR        3.4 (240)    9.2 (84)     16.6 (66)    21.6 (44)    27.8 (39)    31.2 (25)    58.0 (16)
LM        4.2 (320)    10.0 (100)   18.2 (82)    22.4 (49)    28.0 (40)    33.0 (32)    59.8 (20)

Table 5.6: True significance levels with absolute values of percentage
differences between the true levels and the nominal levels in
parentheses for the tobit model of sample size N=100 and M=500.

NOMINAL   1%           5%           10%          15%          20%          25%          50%

PARAMETRIC BOOTSTRAPPING OF RESIDUALS
WALD      0.6 (40)     4.8 (4)      11.4 (14)    16.8 (12)    21.8 (9)     26.8 (7)     52.6 (5)
LR        0.8 (20)     5.4 (8)      11.0 (10)    17.0 (13)    21.8 (9)     26.2 (5)     53.0 (6)
LR-AD     1.0 (0)      4.4 (12)     10.0 (0)     14.2 (5)     21.6 (8)     25.4 (2)     50.8 (2)
LM        0.6 (40)     5.0 (0)      11.0 (10)    17.2 (15)    22.0 (10)    27.0 (8)     52.8 (6)

FLOOD'S AUGMENTED BOOTSTRAPPING OF RESIDUALS
WALD      1.0 (0)      4.6 (8)      9.6 (4)      15.8 (5)     21.2 (6)     26.4 (6)     52.0 (4)
LR        1.0 (0)      4.8 (4)      10.4 (4)     15.6 (4)     22.2 (11)    26.6 (6)     52.0 (4)
LR-AD     1.0 (0)      4.4 (12)     9.6 (4)      13.2 (12)    19.8 (1)     25.6 (2)     51.8 (4)
LM        1.0 (0)      4.4 (12)     10.0 (0)     15.8 (5)     21.6 (8)     26.4 (6)     52.2 (4)

ASYMPTOTIC CHI-SQUARE DISTRIBUTION
WALD      1.8 (80)     9.4 (88)     15.8 (58)    22.4 (49)    30.0 (50)    34.4 (38)    59.6 (20)
LR        2.0 (100)    9.4 (88)     16.0 (60)    22.2 (48)    29.6 (48)    33.8 (35)    59.6 (20)
LM        2.8 (180)    10.2 (104)   17.6 (76)    23.6 (57)    30.4 (52)    34.6 (38)    60.4 (21)

Table 5.7: Total of absolute values of percentage differences between the true
levels and the nominal levels with each test for different methods,
with M=500.

MODEL      LOGIT                   PROBIT                  TOBIT
N          50          100         50          100         50          100

PARAMETRIC BOOTSTRAPPING OF RESIDUALS
WALD       129(0)      45(0)       -290(2)     -82(1)      -95(2)      91(2)
LR         208(0)      70(0)       -259(3)     65(3)       -87(2)      71(1)
LR-AD      44(0)       -111(4)     -306(2)     -84(1)      -165(0)     -29(3)
LM         280(0)      -75(5)      -125(4)     154(1)      -95(1)      89(1)
AVERAGE    23.6        10.8        35.0        13.8        15.8        10.0

FLOOD'S AUGMENTED BOOTSTRAPPING OF RESIDUALS
WALD       -345(0)     -510(0)     -260(3)     410(5)      -25(1)      33(2)
LR         -286(0)     -463(0)     -243(3)     311(5)      59(4)       33(1)
LR-AD      -364(0)     -479(0)     -264(2)     -203(1)     -175(1)     -35(2)
LM         -169(0)     -401(0)     -192(3)     -266(1)     -37(1)      35(1)
AVERAGE    41.6        66.2        34.3        42.5        10.6        4.9

ASYMPTOTIC CHI-SQUARE DISTRIBUTION
WALD       471(0)      315(0)      2069(0)     1010(0)     375(0)      383(0)
LR         207(0)      -98(5)      265(0)      -37(3)      514(0)      399(0)
LM         59(0)       -90(4)      -124(3)     -111(1)     643(0)      528(0)
AVERAGE*   19.0        13.4        27.8        10.6        73.0        62.4

* Average absolute values of percentage differences of LR and LM tests only.

Table 5.8: Total of absolute values of percentage differences between the true
levels and the nominal levels with each test for different methods,
without the 1% column, with M=500.

MODEL      LOGIT                   PROBIT                  TOBIT
N          50          100         50          100         50          100

PARAMETRIC BOOTSTRAPPING OF RESIDUALS
WALD       69(0)       45(0)       -190(2)     -42(0)      -75(2)      51(1)
LR         88(0)       50(0)       -159(3)     -25(1)      -67(2)      51(1)
LR-AD      44(0)       31(2)       -206(2)     -44(1)      -145(0)     -29(3)
LM         140(0)      35(1)       85(2)       34(1)       -75(1)      49(1)
AVERAGE    14.2        6.7         26.7        6.0         15.0        7.5

FLOOD'S AUGMENTED BOOTSTRAPPING OF RESIDUALS
WALD       -265(0)     -410(0)     -160(3)     -130(1)     -25(1)      33(2)
LR         -226(0)     -383(0)     -163(3)     -171(1)     -19(1)      33(1)
LR-AD      -264(0)     -379(0)     -164(2)     -123(0)     -115(0)     -35(2)
LM         -169(0)     -341(0)     -132(3)     -206(1)     -37(1)      35(2)
AVERAGE    38.5        63.0        25.8        26.3        8.2         5.7

ASYMPTOTIC CHI-SQUARE DISTRIBUTION
WALD       311(0)      285(0)      889(0)      610(0)      175(0)      303(0)
LR         147(0)      38(1)       225(0)      -37(3)      274(0)      299(0)
LM         59(0)       -50(4)      -64(3)      -91(1)      323(0)      348(0)
AVERAGE*   17.2        7.3         24.1        10.7        42.9        52.8

* Average absolute values of percentage differences of LR and LM tests only.

Table 5.9: Total of each percentage column of absolute values of relative change
of true level to nominal level for each table from 5.1 to 5.6.

PERCENT    LOGIT               PROBIT              TOBIT
N          50        100       50        100       50        100

1%         780       610       1960      1220      940       460
5%         440       486       984       352       440       340
10%        328       420       554       372       354       240
15%        267       312       351       237       181       225
20%        297       300       255       201       160       212
25%        264       265       213       188       119       153
50%        176       214       80        164       74        96

Table 5.10: The maximum probability of conflict for each different method
among three tests in three models. All entries are in percentage.

NOMINAL           1%      5%      10%     15%     20%     25%     50%

PARAMETRIC BOOTSTRAPPING OF RESIDUALS
LOGIT     50      0.8     1.2     2.6     1.4     2.2     2.2     0.4
LOGIT     100     0.4     1.2     0.4     1.0     0.4     1.0     0.6
PROBIT    50      1.4     3.6     5.0     6.8     5.2     3.2     2.0
PROBIT    100     0.8     1.6     2.4     1.4     1.4     0.4     0.8
TOBIT     50      0.0     0.6     0.4     0.6     0.2     0.2     0.4
TOBIT     100     0.2     0.6     0.4     0.4     0.2     0.8     0.4

FLOOD'S AUGMENTED BOOTSTRAPPING OF RESIDUALS
LOGIT     50      0.8     1.4     2.6     3.2     2.4     3.4     2.4
LOGIT     100     0.4     1.2     1.2     2.2     1.6     1.4     2.4
PROBIT    50      0.4     0.8     0.6     1.0     1.2     3.2     0.2
PROBIT    100     2.2     2.0     2.6     2.8     4.6     4.2     3.2
TOBIT     50      0.4     0.4     0.4     0.6     0.6     0.2     0.2
TOBIT     100     0.0     0.4     0.8     0.2     1.0     0.2     0.2

ASYMPTOTIC CHI-SQUARE DISTRIBUTION
LOGIT     50      1.6     5.8     5.0     4.2     5.4     6.0     3.6
LOGIT     100     0.7     5.6     5.6     7.0     7.4     7.8     4.4
PROBIT    50      12.4    18.4    22.0    21.4    20.4    16.8    9.4
PROBIT    100     4.2     11.6    13.8    14.4    12.8    13.4    11.6
TOBIT     50      1.2     2.6     3.6     3.2     3.0     4.2     3.6
TOBIT     100     1.0     0.8     1.8     1.4     0.8     0.8     0.8

CHAPTER  6 
CONCLUSIONS 

This  dissertation  extends  Hall's  (1992)  short  bootstrap  confidence  intervals 
to  the  quasi-pivotal  method  which  provides  a  further  correction  method  for 
generating  bootstrap  confidence  intervals.  It  is  shown  theoretically  and  empirically 
through  Monte  Carlo  experiments  that,  among  all  methods  generating  bootstrap 
confidence  intervals,  the  bootstrap  quasi-pivotal  method  is  the  best.  This  method 
would  also  be  a  useful  correction  method  as  long  as  the  confidence  interval 
generating  method  uses  the  percentile  method.  The  results  based  on  the  correction 
method  are  better  than  those  based  on  uncorrected  methods. 

Also  the  generation  of  a  real  bootstrap  confidence  interval  is  proposed.  This 
performs  very  satisfactorily  when  the  underlying  distribution  is  symmetric. 
However,  in  reality,  most  underlying  distributions  are  asymmetric.  Therefore, 
bootstrap  confidence  intervals,  which  are  most  useful  in  small  samples,  should  be 
generated  by  the  quasi-pivotal  method.  A  method  of  finding  a  bootstrap  trimmed 
mean  is  proposed,  which  yields  satisfactory  results  for  a  linear  regression  model 
with  a  symmetric  error  distribution. 

For the logit model, the parametric bootstrap method with the logit
maximum likelihood estimation method provides the most reliable estimates among
the bootstrap estimates, and gives satisfactory significance levels for tests of
hypotheses.

For the probit model, the parametric bootstrap method with the probit
maximum likelihood estimation method gives the most reliable estimates among
the bootstrap estimates, and also yields satisfactory significance levels for tests
of hypotheses, though only at a sample size of 100 or larger.

For  the  tobit  model,  the  parametric  bootstrap  method  and  the  augmented 
bootstrap  method  are  equivalent  in  estimation,  and  both  give  satisfactory 
significance  levels  for  tests  of  hypotheses. 

For hypothesis testing in the logit and probit models, the Lagrange
multiplier test based on chi-square critical values provides accurate true
significance levels. For testing in the tobit model, the Wald, likelihood ratio, and
Lagrange multiplier tests using the augmented semiparametric bootstrap based
critical values are equivalent and provide the best true significance levels. The
Wald test using chi-square based critical values always yields true levels
substantially above the nominal levels; that is, it over-rejects the null
hypothesis. The bootstrap Bartlett adjusted likelihood ratio test does not
perform as well as expected in small samples.

With the parametric bootstrap method and the augmented bootstrap
method, the probabilities of conflict between the Wald, likelihood ratio, and
Lagrange multiplier tests are of almost no consequence for the logit, probit,
or tobit models.

In the logit model, the nonparametric bootstrap method gives small mean
squared errors of the β's (MSEβ) because of a small residual sum of squares of the β's
(RSSβ), but gives a significantly large sum of squared residuals of Y (SSRY), which
causes large variation in the prediction. That is, the nonparametric bootstrap
method does not produce reliable estimates for the logit model.

In the probit model, the nonparametric bootstrap method gives small
MSEβ's because of small biases and small RSSβ's, but gives a significantly large
SSRY, which causes large variation in the prediction. That is, the nonparametric
bootstrap method does not produce reliable estimates for the probit model.

In both the tobit II and tobit III models, the nonparametric bootstrap
method has a large bias for E(Y) and a significantly large SSRY. This causes large
variation in the prediction from the model. That is, the nonparametric bootstrap
method does not produce reliable estimates for either the tobit II or the tobit III
model.

In the linear regression model, the nonparametric bootstrap method gives
a better¹ mean SSRY, and the coverage of its confidence interval is almost the
same as the coverage of the confidence interval generated by the parametric
bootstrap method. By the other two confidence interval criteria, however, the
nonparametric bootstrap method does not give an interval estimate as good as the
one generated by the parametric bootstrap method.

In hypothesis testing, the nonparametric bootstrap method yields useless
test statistics for the Wald, likelihood ratio, and Lagrange multiplier tests in the
logit, probit, and tobit models.

We use the efficient parametric estimation method, maximum likelihood
estimation (MLE), to estimate the parametric logit, probit, and tobit models,
and the efficient nonparametric estimation method, ordinary least squares (OLS),
to estimate the linear regression model. Using the nonparametric bootstrap
method with MLE, we do not obtain a reliable point estimate because of the
significantly large SSRY. Using the nonparametric bootstrap method with OLS, we
obtain a reliable point estimate because of the better mean SSRY, but we still do
not have a good interval estimate. That is to say, the nonparametric bootstrap
method performs better if the efficient estimation method is nonparametric.

¹ The nonparametric bootstrap method gives an MSEY closer to the true value than
the parametric bootstrap method does.

However, by using the parametric bootstrap method with efficient
estimation methods, we obtain reliable point estimates in the estimation of the
logit, probit, tobit, and linear regression models; we obtain a reliable interval
estimate in the estimation of the linear regression model; we obtain reliable
estimates of true significance levels; and the conflict problem is of almost no
consequence.

Therefore, with an efficient estimation method, the parametric bootstrap
method is highly recommended, whereas the nonparametric bootstrap method does not
perform well. The performance of a semiparametric bootstrap method with an
efficient estimation method depends on the model and on the detailed structure
of the bootstrap method. For instance, Flood's augmented semiparametric
bootstrap method performs satisfactorily for the tobit model, but not for the logit
and probit models. The mixed-augmented semiparametric bootstrap method does
not perform satisfactorily at all.
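
The contrast between the two main resampling schemes can be sketched for a logit model as follows. This is a rough illustration only, and it rests on assumptions not restated in this chapter: parametric bootstrapping of residuals is taken to mean drawing logistic errors for the latent index at the fitted coefficients, and bootstrapping the data is taken to mean resampling whole (x, y) pairs with replacement. The function names and the fitted coefficients `beta_hat` are hypothetical.

```python
import math
import random


def parametric_residual_resample(x_rows, beta_hat, rng):
    """Keep the regressors fixed and regenerate y* from the fitted logit.

    y* = 1 if x'beta_hat + e* > 0, where e* is a standard logistic draw.
    """
    y_star = []
    for row in x_rows:
        index = sum(b * x for b, x in zip(beta_hat, row))
        u = rng.random()
        while u == 0.0:  # avoid log(0) in the inverse-CDF draw
            u = rng.random()
        e_star = math.log(u / (1 - u))  # inverse CDF of the standard logistic
        y_star.append(1 if index + e_star > 0 else 0)
    return x_rows, y_star


def pairs_resample(x_rows, y, rng):
    """Bootstrapping the data: resample (x_i, y_i) pairs with replacement."""
    n = len(y)
    picks = [rng.randrange(n) for _ in range(n)]
    return [x_rows[i] for i in picks], [y[i] for i in picks]


# Tiny made-up sample with an intercept and one regressor.
rng = random.Random(0)
x_rows = [[1.0, -1.0], [1.0, 0.0], [1.0, 0.5], [1.0, 2.0]]
y = [0, 0, 1, 1]
beta_hat = [0.1, 1.2]  # hypothetical fitted coefficients

x_par, y_par = parametric_residual_resample(x_rows, beta_hat, rng)
x_np, y_np = pairs_resample(x_rows, y, rng)
```

In the first scheme every bootstrap sample keeps the original regressors and draws on the fitted parametric model; in the second, the regressors themselves vary from sample to sample and no distributional assumption is used.
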